bug-groff

[bug #51330] preconv fails to detect utf-8 without BOM


From: Werner LEMBERG
Subject: [bug #51330] preconv fails to detect utf-8 without BOM
Date: Wed, 28 Jun 2017 04:58:27 -0400 (EDT)

Follow-up Comment #1, bug #51330 (project groff):

I like the idea of using a library to guess the charset and encoding.
However, I don't think libmagic is suited to that: as far as I can see,
it returns a textual description of the data, which preconv would then
have to parse itself.  Please correct me if I'm wrong.

Looking around, the best choice is probably uchardet:

https://www.freedesktop.org/wiki/Software/uchardet/

We could make preconv use it optionally if it is available.
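
To illustrate what "optionally if it is available" could look like, here is
a minimal sketch in Python rather than in preconv's own code.  It is not
groff's implementation.  The names guess_encoding and _library_guess, the
order of the checks, and the latin-1 default are all assumptions made for
the example, as is the use of the third-party cchardet binding (which is
built on uchardet) in place of the C library itself.

```python
import codecs

try:
    # Assumption: the third-party "cchardet" binding (built on uchardet)
    # may or may not be installed; the code works either way.
    import cchardet

    def _library_guess(data):
        return cchardet.detect(data).get("encoding")
except ImportError:
    _library_guess = None

# UTF-32 must be tested before UTF-16: the UTF-32 LE BOM begins with
# the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data, detector=_library_guess, default="latin-1"):
    """Guess the encoding of a bytes object (hypothetical helper)."""
    # 1. An explicit BOM settles the question.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. No BOM: input that decodes strictly as UTF-8 is treated as UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # 3. Ask the detection library, but only if one is available.
    if detector is not None:
        guess = detector(data)
        if guess:
            return guess
    # 4. Otherwise fall back to a fixed default.
    return default
```

With this ordering, BOM-less UTF-8 input is recognized even when no
detection library is installed; the library is consulted only for input
that is not valid UTF-8.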

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?51330>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



