Re: auto-detecting encoding for XML

From: Colin Walters
Subject: Re: auto-detecting encoding for XML
Date: 20 May 2002 03:04:56 -0400

On Sun, 2002-05-19 at 19:13, Stefan Monnier wrote:
> >     * international/mule.el (auto-coding-functions): New variable.
> Why not extend auto-coding-regexp-alist so it can associate a regexp
> to a function (rather than a coding-system) ?

Hm.  It seems cleaner to just have the function do the searching in the
first place.  Otherwise we match against a regexp, and then call a
function which will probably have to do the same searching again...

> Or why not do what po.el does (i.e. use file-coding-system-alist) ?
> Admittedly, the file-coding-system-alist approach is pretty
> hairy/heavy-weight.

Well, it also has the disadvantage in this case that it depends on file
extensions; XML tends to be used as the underlying format for other
types of files, which use their own extensions.  So using file names as
a way to detect XML is probably a bad approach.

Just as a random sample on my system:

~/.gconf/* contains XML files, and their extension is .xml.
/etc/oglerc is an XML file, but doesn't have an extension at all.
~/local-cvs/resume/resume.fo is an XML file.
.nautilus-metafile.xml is XML.
/foreign-cvs/cvs.gnome.org/evolution/views/mail/Messages.galview is XML.
/foreign-cvs/cvs.gnome.org/evolution/views/mail/galview.xml is XML.

So only about 50% of the XML files have an "obvious" extension like

> In any case we should come up with some way to do those things conveniently,
> because it applies to po-mode, to sgml-mode to tex-mode and probably
> a lot more.

auto-coding-functions should be able to handle those.

> Note that these are always associated with a mode, so
> it would be good if the implementation also was mode-specific so
> that it automatically works if you open an xml file called
> foo.myxmlextension (as long as "\\.myxmlextension\\'" is in the
> auto-mode-alist).

Yes.  It's very tricky though.  We can't possibly cover all the file
name extensions that would be used for XML.  I agree that it would be
great if we had a way to associate it with a mode.  The problem with
that though is that by the time the major mode function is called, the
file will have already been read from disk, and the only way to change
the coding system is to reread it from disk (as I understand things). 
And doing that in a major mode function is kind of a hack.  Maybe that's
the best solution, but auto-coding-functions certainly does the trick
here, and it seems to be extensible to handle the analogous po-mode and
tex-mode problems.
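
For illustration, a function suitable for auto-coding-functions might
look roughly like the sketch below.  It assumes the calling convention
documented for the variable in current Emacs: each function receives one
argument, SIZE, with point at the start of the undecoded text, and
returns a coding system or nil.  The name my-xml-auto-coding is
hypothetical, and the regexp only handles an XML declaration written in
an ASCII-compatible encoding.

```elisp
;; Sketch only.  `my-xml-auto-coding' is a hypothetical name, and the
;; one-argument SIZE convention is an assumption (see above).
(defun my-xml-auto-coding (size)
  "Return the coding system named in an XML declaration, or nil.
Look at no more than SIZE characters starting at point, and do not
modify the buffer."
  (save-excursion
    (let ((limit (min (point-max) (+ (point) size)))
          (case-fold-search nil))
      ;; Only act if the text really starts with an XML declaration.
      (when (looking-at "<\\?xml[ \t\r\n]")
        (let ((decl-end (save-excursion (search-forward "?>" limit t))))
          (when (and decl-end
                     (re-search-forward
                      "encoding[ \t\r\n]*=[ \t\r\n]*[\"']\\([A-Za-z][-A-Za-z0-9._]*\\)[\"']"
                      decl-end t))
            ;; Return the name only if Emacs knows it as a coding system.
            (let ((cs (intern (downcase (match-string 1)))))
              (and (coding-system-p cs) cs))))))))

;; Register it so it is consulted when a file is visited.
(add-to-list 'auto-coding-functions #'my-xml-auto-coding)
```

Since the function looks only at the contents, it applies equally to
/etc/oglerc, resume.fo and Messages.galview, none of which a file name
pattern would catch.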
