emacs-devel

po file charset via auto-coding-functions


From: Kevin Ryde
Subject: po file charset via auto-coding-functions
Date: Fri, 21 Oct 2005 07:06:49 +1000
User-agent: Gnus/5.110004 (No Gnus v0.4) Emacs/21.4 (gnu/linux)

This is a proposal to get the coding system for a .po file via
auto-coding-functions, rather than having textmodes/po.el read the
file explicitly.

2005-10-20  Kevin Ryde  <address@hidden>

        * international/mule.el (po-content-type-charset-alist): Moved from
        textmodes/po.el, add "CHARSET" which is a placeholder from xgettext.
        (po-auto-coding-function): New function.  This gets the right coding
        system when visiting a .po via archive-mode; po-find-file-coding-system
        only worked on a normal file.  charset= regexp from textmodes/po.el.
        (auto-coding-functions): Use po-auto-coding-function.
        * international/mule-conf.el (file-coding-system-alist): Remove
        po-find-file-coding-system.
        * textmodes/po.el: Remove file, no longer used.
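
For readers without the attachments, here is a rough sketch of what an
auto-coding function for .po files could look like.  It is not the
attached patch: the names `my-po-charset-alist' and
`my-po-auto-coding-function', the regexp, and the alist contents are
all assumptions made for illustration.  The only documented interface
relied on is that each function in `auto-coding-functions' is called
with one argument SIZE, with point at the start of the text to examine,
and returns a coding system or nil.

```elisp
;; Hypothetical sketch, not the code from mule.el.po-coding.diff.
(defvar my-po-charset-alist
  '(("ASCII" . undecided)
    ("US-ASCII" . undecided)
    ;; "CHARSET" is the placeholder xgettext leaves in a fresh template.
    ("CHARSET" . undecided))
  "Assumed map from upcased charset names to coding systems.")

(defun my-po-auto-coding-function (size)
  "Return the coding system named in a PO header, or nil.
Only the SIZE characters following point are examined."
  (save-excursion
    (let ((case-fold-search t)
          (limit (+ (point) size)))
      (when (re-search-forward
             "^\"Content-Type:[ \t]*text/plain;[ \t]*charset=\\([^\\\"]*\\)"
             limit t)
        (let* ((name (match-string 1))
               (mapped (assoc (upcase name) my-po-charset-alist))
               ;; intern-soft avoids creating symbols for unknown names.
               (sym (intern-soft (downcase name))))
          (cond (mapped (cdr mapped))
                ((and sym (coding-system-p sym)) sym)
                (t (message "Unknown PO charset: %s" name)
                   nil)))))))
```

Such a function would be registered with something like
(add-to-list 'auto-coding-functions #'my-po-auto-coding-function).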


One possible problem is that po files can have more than 1024 bytes of
comments before the header info block.  I see fileio.c
Finsert_file_contents only grabs 1024 bytes before calling
set-auto-coding, but I can't tell whether or when that limit applies.
I think a normal visit or an `archive-extract' has the whole file, so
those cases work.
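
One way to see which files would be affected by a 1024-byte limit is to
measure how far into each file the charset declaration sits.  The
helper below is only a sketch; `my-po-charset-offset' is a hypothetical
name, and searching for the bare string "charset=" is a simplifying
assumption rather than the regexp from textmodes/po.el.

```elisp
;; Hypothetical helper for checking the 1024-byte concern.
(defun my-po-charset-offset (filename)
  "Return the 0-based byte offset of \"charset=\" in FILENAME, or nil."
  (with-temp-buffer
    ;; Literal insertion gives one buffer character per file byte,
    ;; so buffer positions correspond to byte offsets.
    (insert-file-contents-literally filename)
    (goto-char (point-min))
    (when (search-forward "charset=" nil t)
      (- (match-beginning 0) (point-min)))))
```

A file for which this returns a number of 1024 or more has its charset
declaration outside the first 1024 bytes.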


I used the following bit of code to exercise po-auto-coding-function
on all my .po files.  The function prints messages about bad charsets;
the result of the expression is a list of the bad files.

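;; Collect the .po files for which po-auto-coding-function returns nil;
;; files with a recognised charset yield nil and are dropped by delq.
(delq nil
      (mapcar (lambda (filename)
                (with-temp-buffer
                  ;; Insert raw bytes, with no decoding before the check.
                  (insert-file-contents-literally filename)
                  (goto-char (point-min))
                  ;; Pass the whole buffer size as SIZE.
                  (unless (po-auto-coding-function
                           (- (point-max) (point-min)))
                    filename)))
              ;; locate prints one filename per line; drop empty strings.
              (delete "" (split-string
                          (shell-command-to-string "locate \\*.po") "\n"))))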

Among my files I found two unrecognised charsets:

"TCVN-5712" in gtk 1.2 Vietnamese.  Is there a good place to map or
alias that to `tcvn', which Emacs knows?
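
One candidate mechanism is `define-coding-system-alias', which makes a
new symbol name an existing coding system.  The snippet below is only a
sketch of that idea, not a recommendation of where the alias should
live; the guard is an assumption added so that it does nothing when
`tcvn' is missing or when `tcvn-5712' is already defined.

```elisp
;; Sketch: let the name tcvn-5712 refer to the existing tcvn coding system.
(when (and (coding-system-p 'tcvn)
           (not (coding-system-p 'tcvn-5712)))
  (define-coding-system-alias 'tcvn-5712 'tcvn))
```

With such an alias in place, a lookup that downcases the charset name
from the header would find `tcvn-5712' directly.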

"iso-8859-9e" in gtk 1.2 Azerbaijani Turkish, but I don't know what
that charset is or is meant to be.  glibc iconv doesn't seem to
recognise it, so presumably it's unused.


Attachment: mule.el.po-coding.diff
Description: Text document

Attachment: mule-conf.el.po-coding.diff
Description: Text document

