RE: Automatic recognition of some specific coding systems

From: Jürgen Hartmann
Subject: RE: Automatic recognition of some specific coding systems
Date: Wed, 25 Feb 2015 18:53:39 +0100

Thank you, Eli Zaretskii, for repeatedly digging into this problem:

>> encoded and its contents is displayed as
>>    \204\224\201\341\216\231\232
> That's true, but I see the same behavior in Emacs 22.3, if I invoke it
> with "emacs -q" (lowercase 'q', since 22.x didn't support -Q), so
> there's no change in behavior here.

That is right: I had to do some minor configuration to get Emacs 22.3
to correctly recognize these three coding systems. See below.

> How exactly did you verify with v22.3?  As I wrote above, I see the
> same behavior in that version.  Did you invoke it with -q?  If not,
> there are some customization of yours that modify the default
> behavior, and the question becomes how to express the same
> customizations in Emacs 24.

To set up a clean stage, I just recompiled Emacs 22.3 from the vanilla
GNU sources and started one session with -q and another with -Q,
getting the same result in both cases.

For the tests I used the same sample text files that I described in my
previous post.

As you already described, without any customization the automatic
recognition fails in the case of the cp850-dos encoded text file, as
its coding is recognized as raw-text-dos. So far we get the same
result as in the Emacs 24.4 case.

But if one issues the commands

   (check-coding-system 'cp850)
   (setq coding-category-ccl 'cp850)

in the *scratch* buffer (Lisp Interaction mode) of Emacs 22.3 right
after starting the session, all three coding systems will be perfectly
recognized when the text files are visited.

After this customization, the value of the variable
coding-category-list has the form

   (coding-category-utf-8 coding-category-iso-8-1 coding-category-ccl ...)

where the values of the variables coding-category-utf-8,
coding-category-iso-8-1, and coding-category-ccl are mule-utf-8,
iso-latin-1, and cp850 respectively.
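
For reference, this mapping can be listed in Emacs 22.3 with an
expression along the following lines. It is only a sketch: it assumes,
as described above, that every symbol in coding-category-list is itself
a variable holding a coding system, and the boundp guards are merely a
precaution for sessions where that does not hold.

   ;; Pair each coding category with the coding system assigned to it,
   ;; in priority order.  Evaluate with C-j in *scratch*.
   (when (boundp 'coding-category-list)
     (mapcar (lambda (category)
               (cons category
                     (and (boundp category) (symbol-value category))))
             coding-category-list))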

You are perfectly right in stating that the question to be addressed
now is how to port these customization commands to the current version
24.4 of Emacs. In that version the coding system cp850 is no longer
implemented via CCL, and it is associated with the coding category
coding-category-charset--the same category that the systems latin-1
and latin-9 are associated with. Furthermore, the command
update-coding-systems-internal is no longer available, but this
might be a minor detail.
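
One possible starting point in Emacs 24.4 is to inspect and reorder the
detection priority with coding-system-priority-list and
set-coding-system-priority. The following is an untested sketch, not a
confirmed fix: since cp850 shares its category with latin-1 and latin-9,
it is an open question whether reordering alone is enough to get all
three sample files recognized.

   ;; Show the current detection priority, highest first.
   (coding-system-priority-list)

   ;; Move utf-8, iso-latin-1 and cp850 to the front, in that order.
   ;; Assumption: this order mirrors the Emacs 22.3 setup shown above.
   (set-coding-system-priority 'utf-8 'iso-latin-1 'cp850)

   ;; Check which coding systems now lead the list.
   (coding-system-priority-list)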

I am rather clueless here, so any help is most welcome.


