bug-gnu-emacs

bug#35507: Gnus mojibakifies UTF-8 text/x-patch attachments from Thunderbird


From: Eli Zaretskii
Subject: bug#35507: Gnus mojibakifies UTF-8 text/x-patch attachments from Thunderbird
Date: Thu, 02 May 2019 14:04:26 +0300
User-agent: K-9 Mail for Android

On May 2, 2019 10:17:51 AM GMT+03:00, Andy Moreton <address@hidden> wrote:
> On Wed 01 May 2019, Noam Postavsky wrote:
> 
> > Eli Zaretskii <address@hidden> writes:
> >
> >>> From: Andy Moreton <address@hidden>
> >>> Date: Wed, 01 May 2019 17:42:18 +0100
> >>> 
> >>> +              (mm-decode-string text 'utf-8))))
> >>
> >> As I said, I'm not sure we should do this, let alone unconditionally
> >> force UTF-8 here, but if we must, why not use decode-coding-string?
> >> Do we really need the mm-* stuff?
> >
> > As far as I can tell, the mm-* version is useful for handling stuff like
> > "UTF-8" as the charset argument (which might be useful if we extract it
> > from the "Content-Type: text/plain; charset=UTF-8" header).  If passing
> > 'utf-8, then it's just the same as calling decode-coding-string.
> 
> OK, in that case we could indeed just call decode-coding-string.
> 
> > For a default if we don't find a charset header, I guess `undecided'
> > would make more sense, right?  After all, Emacs already has the coding
> > detection machinery, may as well use it.
> 
> Please re-read the original bug report: the problem is with malformed
> messages that do not contain a charset field in the Content-Type
> header.
> 
> The one-liner patch changes the default for inline display in the
> Gnus article buffer to assume UTF-8 when nothing is specified, rather
> than just inserting the text without decoding it.
> 
> That should result in text that actually is UTF-8 being displayed
> correctly, and no change to plain ASCII. For anything else, the user can
> use the `gnus-mime-view-part-as-charset' command to override the
> default.
> 
>     AndyM

Using 'undecided' doesn't disable decoding; it just means Emacs will try to
detect the correct encoding by looking at the text (not at the charset header).
In a UTF-8 locale, we will guess UTF-8 anyway, unless we see invalid sequences.

So yes, I think Noam is right, and 'undecided' is a better alternative here.
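
To make the two options in this thread concrete, here is a minimal sketch of
"use the charset if the header gives one, otherwise let Emacs detect". It is
not the patch under discussion: `my/decode-part-text' is a hypothetical helper
name, and mapping a MIME charset to a coding system by downcasing and interning
it is a simplifying assumption. Only `decode-coding-string',
`encode-coding-string' and `coding-system-p' are existing Emacs functions.

```elisp
;; Hypothetical helper, not part of Gnus.
(defun my/decode-part-text (text &optional charset)
  "Decode the raw bytes TEXT and return a multibyte string.
CHARSET is a MIME charset name such as \"UTF-8\", or nil when the
Content-Type header carries no charset parameter."
  ;; Assumption: the charset name maps directly onto a coding system
  ;; name once downcased (true for \"UTF-8\", not for every charset).
  (let ((coding (and charset (intern (downcase charset)))))
    (decode-coding-string
     text
     (if (and coding (coding-system-p coding))
         coding
       ;; No usable charset: let Emacs detect the encoding from the
       ;; bytes themselves instead of forcing UTF-8.
       'undecided))))

;; RAW stands in for the undecoded bytes of an attachment.
(let ((raw (encode-coding-string "café" 'utf-8)))
  (list (my/decode-part-text raw "UTF-8")  ; explicit charset
        (my/decode-part-text raw)))        ; detection via `undecided'
```

With an explicit "UTF-8" the first element is the decoded string "café". The
second element depends on the coding-system priorities of the running Emacs,
which is the point made above: in a UTF-8 locale detection is expected to pick
UTF-8 unless the bytes contain invalid sequences.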




