Re: smtpmail and ~/.authinfo

From: Eli Zaretskii
Subject: Re: smtpmail and ~/.authinfo
Date: Tue, 27 Sep 2011 07:31:23 -0400

> From: Ted Zlatanov <address@hidden>
> Date: Tue, 27 Sep 2011 05:38:28 -0500
> Reply-To: address@hidden
> On Tue, 27 Sep 2011 05:57:28 +0300 Eli Zaretskii <address@hidden> wrote: 
> >> From: Stefan Monnier <address@hidden>
> >> Date: Mon, 26 Sep 2011 17:31:52 -0400
> >> 
> >> I think raw-text is more likely to work, based on what Lars says.
> EZ> That was also my conclusion.
> I think we should make an effort to make the netrc/authinfo file
> shareable with other programs

I agree.  But to do that, it sounds like we are lacking some knowledge
about the intended use of these files, especially when they are used
in conjunction with external services.  If someone can prepare an
exhaustive list of such uses, or at least those we want to support,
and tell what encodings can be used with each of them, we can take it
from there the way you want it.  But if such details are not known at
the moment, we may actually break some legitimate uses, which would be
a pity.

> raw-text encoding is, to me, saying "we give up."

Give up knowing exactly how the stuff is encoded, yes.  There's
nothing wrong with that; after all, we do that when we edit binary
files, don't we?

> I thought today, on most popular platforms, UTF-8 was the safest choice
> if you want to share data that covers UCS.

UCS and UTF-8 are not the same thing.  Windows uses UCS (well,
actually UTF-16) internally, but UTF-8 is seldom seen there, e.g. you
will never see a file name encoded in UTF-8 on a Windows filesystem,
except as an accident.
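
To illustrate the distinction, here is a minimal Python sketch (the
variable names are mine, purely for illustration): the same UCS code
point comes out as different bytes depending on which encoding is
used, so "covers UCS" does not by itself tell you what is in the file.

```python
# One UCS code point, U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
text = "\u00e9"

# The same code point serialized under two different encodings.
utf8_bytes = text.encode("utf-8")       # two bytes: C3 A9
utf16_bytes = text.encode("utf-16-le")  # two bytes: E9 00

# Identical character, different byte sequences.
print(utf8_bytes.hex(), utf16_bytes.hex())
```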

Stephen gave you examples with CJK locales, where UTF-8 might not be
as popular as you'd like, even on POSIX systems.

And even in Europe there are a few locales which prefer single-byte
encoding of some kind, AFAIK.

So I think you are being overly optimistic in asserting that UTF-8 is
"the safest choice".

> The other objection to UTF-8 was that some binary sequences can't be
> encoded by it.  Remember, we're talking about passwords and other
> legible tokens, not binary files.  The likelihood of such a sequence in
> a token is too small to matter IMO.  So I still think raw-text is the
> worse choice even though it's easier to make it.

You read "binary" incorrectly.  For the purposes of this discussion,
"binary" == "arbitrary byte values".  Not every 8-bit byte is valid as
part of a UTF-8 sequence.  If the authinfo file includes such bytes,
it cannot be encoded in UTF-8, except if we use the Emacs extensions,
which will be only useful for Emacs.  Such bytes can easily come from
some single-byte encoding, for example.  To DTRT with such bytes, we
_must_ know their precise encoding; then we could _recode_ them in
UTF-8, and encode back when we send the string to external services.
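
A minimal sketch of the problem, in Python rather than Emacs Lisp,
assuming a token that was written in a Latin-1 locale (the helper
names `is_valid_utf8` and `recode` are hypothetical, made up for this
example):

```python
# Assumption for illustration: a token "cafe" with an acute accent on
# the e, saved by a program running in a Latin-1 locale.
raw = "caf\u00e9".encode("latin-1")  # the accented letter is the single byte 0xE9


def is_valid_utf8(data: bytes) -> bool:
    """Return True if DATA decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def recode(data: bytes, source: str, target: str) -> bytes:
    """Recode DATA from SOURCE to TARGET; only correct if SOURCE is really known."""
    return data.decode(source).encode(target)


# The stored bytes are not valid UTF-8: 0xE9 starts a multi-byte
# sequence that is never completed.
print(is_valid_utf8(raw))

# Knowing the precise source encoding, we can recode to UTF-8 ...
as_utf8 = recode(raw, "latin-1", "utf-8")

# ... and encode back before sending to the external service.
round_trip = recode(as_utf8, "utf-8", "latin-1")
```

Without knowing that the source was Latin-1, the first step cannot be
done reliably, which is the whole point.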

Once again, blindly assuming that UTF-8 is "safe" is not good enough,
IMO.  We need more details, if someone can provide them.
