Re: smtpmail and ~/.authinfo

From: Ted Zlatanov
Subject: Re: smtpmail and ~/.authinfo
Date: Tue, 27 Sep 2011 15:15:24 -0500
User-agent: Gnus/5.110018 (No Gnus v0.18) Emacs/24.0.90 (gnu/linux)

On Tue, 27 Sep 2011 21:33:37 +0900 "Stephen J. Turnbull" <address@hidden> wrote: 

SJT> Ted Zlatanov writes:
>> UTF-8 is an encoding; you're talking about charsets.

SJT> No, I'm talking about encodings.  I'm not entirely sure about GB 2312,
SJT> but I believe it has a defined preferred encoding (the one registered
SJT> as the MIME charset GB2312 -- MIME charsets are all encodings, they
SJT> specify what *bytes* will appear in the stream, not just an abstract
SJT> character to abstract integer mapping).  Shift JIS is most definitely
SJT> an encoding for the JIS character set (although which JIS character
SJT> set is poorly defined).

Thanks for correcting my misunderstanding.
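
To make the distinction concrete, here is a minimal sketch in Python (not Emacs Lisp, purely for illustration). The sample character and the choice of codecs are my own assumptions, not anything from the thread. It shows that one abstract character becomes a different byte sequence under each encoding, which is why a MIME charset name has to pin down the bytes:

```python
# Illustrative sketch: one character, three encodings, three byte sequences.
# The sample character (HIRAGANA LETTER A, U+3042) is an arbitrary choice.
SAMPLE = "\u3042"

def encode_all(text: str, encodings: list[str]) -> dict[str, bytes]:
    """Return the byte sequence produced for `text` by each encoding."""
    return {name: text.encode(name) for name in encodings}

encoded = encode_all(SAMPLE, ["shift_jis", "euc_jp", "utf-8"])

for name, data in encoded.items():
    # Print the bytes in hex so the difference is easy to see.
    print(f"{name:10s} {data.hex()}")
```

Decoding any of these byte strings with the wrong codec either fails or yields a different character, so the receiver must know which encoding was used.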

SJT> If you already have a password, it should be read verbatim (binary, or
SJT> raw-text should do given the line-oriented nature of these
SJT> configuration files) and treated as a binary blob.
>> That's not helpful when you need to encode it for IMAP, for instance.
>> You have to know the actual characters that make up the binary blob.

SJT> Since when?  I haven't paid much attention to IMAP since RFC 3501 was
SJT> an internet-draft, but in that document there are a few commands that
SJT> accept a CHARSET parameter.  LOGIN and AUTHENTICATE aren't among them.
SJT> So you're just passing along binary blobs, which in the case of LOGIN
SJT> will often look like somebody's birthday or a child's name, but that's
SJT> just an unfortunate accident.

Ditto.  I thought the CHARSET was used for passwords.

On Tue, 27 Sep 2011 07:31:23 -0400 Eli Zaretskii <address@hidden> wrote: 

>> From: Ted Zlatanov <address@hidden>
>> Date: Tue, 27 Sep 2011 05:38:28 -0500
>> Reply-To: address@hidden
>> On Tue, 27 Sep 2011 05:57:28 +0300 Eli Zaretskii <address@hidden> wrote: 
>> >> From: Stefan Monnier <address@hidden>
>> >> Date: Mon, 26 Sep 2011 17:31:52 -0400
>> >> 
>> >> I think raw-text is more likely to work, based on what Lars says.
EZ> That was also my conclusion.
>> I think we should make an effort to make the netrc/authinfo file
>> shareable with other programs

EZ> I agree.  But to do that, it sounds like we are lacking some knowledge
EZ> about the intended use of these files, especially when they are used
EZ> in conjunction with external services.  If someone can prepare an
EZ> exhaustive list of such uses, or at least those we want to support,
EZ> and tell what encodings can be used with each of them, we can take it
EZ> from there the way you want it.  But if such details are not known at
EZ> the moment, we may actually break some legitimate uses, which would be
EZ> a pity.

I know for sure only ASCII (up to 0xff) is supported by libcurl and
older FTP clients.  I thought UTF-8 would be a good compatibility path
but apparently I'm wrong.

EZ> So I think you are being overly optimistic in asserting that UTF-8 is
EZ> "the safest choice".


EZ> You read "binary" incorrectly.  For the purposes of this discussion,
EZ> "binary" == "arbitrary byte values".  Not every 8-bit byte is valid as
EZ> part of a UTF-8 sequence.  If the authinfo file includes such bytes,
EZ> it cannot be encoded in UTF-8, except if we use the Emacs extensions,
EZ> which will be only useful for Emacs.  Such bytes can easily come from
EZ> some single-byte encoding, for example.  To DTRT with such bytes, we
EZ> _must_ know its precise encoding; then we could _recode_ it in UTF-8,
EZ> and encode back when we send the string to external services.

Got it.
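
A minimal sketch of that point, again in Python for illustration only. The sample bytes and the assumption that they came from a Latin-1 program are hypothetical. A byte sequence from a single-byte encoding need not be valid UTF-8, and it can only be recoded to UTF-8 and back if the source encoding is known:

```python
# Hypothetical password bytes as a Latin-1 program might have stored them:
# "p", a-umlaut (0xE4), "s", "s".
raw = b"p\xe4ss"

def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` is a well-formed UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def recode(data: bytes, source: str, target: str) -> bytes:
    """Decode `data` from `source`, then encode the text as `target`."""
    return data.decode(source).encode(target)

# Recoding works only because we assume the source encoding is Latin-1.
as_utf8 = recode(raw, "latin-1", "utf-8")

# Before sending to an external service, encode back to the original bytes.
round_trip = recode(as_utf8, "utf-8", "latin-1")

print(is_valid_utf8(raw), as_utf8, round_trip == raw)
```

If the assumed source encoding is wrong, the round trip may still succeed byte-for-byte while the intermediate text is nonsense, which is why guessing is not safe here.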

On Tue, 27 Sep 2011 08:55:45 -0400 Stefan Monnier <address@hidden> wrote: 

SM> Here's my take on it:
SM> .authinfo contains various things and is used in different ways, and
SM> there isn't a single answer that covers all cases:
SM> - each kind of field (hostname, username, password) may require
SM>   a different encoding/decoding.
SM> - when reading a password from the file, it should be read using
SM>   raw-text (i.e. as a "unibyte string").
SM>   In other words, the password should not be decoded into chars but left
SM>   as a sequence of bytes that will be sent as-is to whoever needs it.
SM> - when a password is typed by the user it'll be a sequence of chars, so
SM>   we'll have to convert it into a sequence of bytes.  The best coding
SM>   system to use for that purpose is probably going to be
SM>   locale-coding-system.  That sequence of bytes is then sent to whoever
SM>   needs it and saved as-is (using raw-text) into the .authinfo file.
SM> - i.e. authinfo should be read as a unibyte file.
SM> - i.e. when reading other fields than passwords, we'll have to
SM>   explicitly decode them using the coding system we want to use for
SM>   those fields.
SM> - similarly, we'll have to encode those other fields manually when
SM>   writing them into .authinfo.
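
The scheme in the list above can be sketched roughly as follows. This is Python, not the auth-source implementation. The function names, the default field encoding, and the simplified whitespace-separated "key value" line format (no quoting) are all assumptions made for illustration:

```python
import locale
from typing import Optional, Union

def parse_authinfo_line(
    line: bytes, field_encoding: str = "utf-8"
) -> dict[str, Union[str, bytes]]:
    """Parse one simplified 'key value key value ...' line read as bytes.

    The password stays a raw byte string; every other field is decoded
    explicitly with `field_encoding` (an assumed per-field choice).
    """
    tokens = line.split()  # splits on ASCII whitespace; quoting is ignored
    entry: dict[str, Union[str, bytes]] = {}
    for key, value in zip(tokens[0::2], tokens[1::2]):
        name = key.decode("ascii")
        if name == "password":
            entry[name] = value  # sent as-is, never decoded into chars
        else:
            entry[name] = value.decode(field_encoding)
    return entry

def encode_typed_password(typed: str, encoding: Optional[str] = None) -> bytes:
    """Turn a password typed by the user into bytes.

    Falls back to the locale's preferred encoding, standing in for
    Emacs's locale-coding-system.
    """
    return typed.encode(encoding or locale.getpreferredencoding(False))

entry = parse_authinfo_line(
    b"machine smtp.example.org login ted password p\xe4ss"
)
print(entry)
```

The design choice is that only the caller of each field decides how it is decoded; the file itself is never assigned a single coding system.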

SM> Of course, another option is to just read&write authinfo without
SM> thinking about it, so Emacs will usually pick locale-coding-system for
SM> it and it'll work just fine in 99.9% of the cases.

It sounds like the latter option is the least work and most reliable.
Users should be able to override the coding system as with any other
file, and we'll just keep the status quo.  I appreciate all the details
and corrections; I thought UTF-8 was better and more widely useful than
it really is.

