From: Stephen J. Turnbull
Subject: Re: 23.0.60; Defaut encoding for XML files should be undefined (instead of utf-8)
Date: Sat, 16 Feb 2008 20:23:43 +0900

Jason Rumney writes:
 > Stephen J. Turnbull wrote:
 > > Stefan Monnier writes:

 > > > [Typically this user is dealing with a fragment of a larger
 > > > document, not the whole document.]

 > > Man, don't go there.  XML documents *and fragments* should be presumed
 > > Unicode until the user explicitly says otherwise.

 > I think you're misunderstanding.

No, I'm not.  Look at the subject.

 > Currently we use utf-8 in absence of a coding tag, even if it
 > causes a decoding error.

Signaling an error in this situation is appropriate.  Guessing what is
meant (e.g., by falling back to undecided) is not.  It's quite possible
that the user is unintentionally in a Latin-1 environment and would
thank you if you reminded them that they should save in UTF-8.
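
To make the "signal, don't guess" policy concrete, here is a minimal
sketch in Python.  It is not code from Emacs, nXML, or PSGML; the
names decode_xml and XMLCodingError are hypothetical, and the regular
expression only handles an ASCII-compatible XML declaration at the
very start of the data.  It honors a declared encoding, presumes UTF-8
otherwise, and raises an error instead of falling back to a guess.

```python
import re

# Hypothetical helper for illustration only; not an Emacs or nXML API.
# Matches an encoding pseudo-attribute in an XML declaration at byte 0.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


class XMLCodingError(ValueError):
    """The bytes do not match the declared (or presumed) encoding."""


def decode_xml(data: bytes) -> str:
    # Use the declared encoding if there is one; otherwise presume UTF-8.
    match = _DECL.match(data)
    encoding = match.group(1).decode("ascii") if match else "utf-8"
    try:
        # bytes.decode is strict by default: bad input raises, never guesses.
        return data.decode(encoding)
    except (UnicodeDecodeError, LookupError) as err:
        # Signal the conflict so the user can add a declaration or re-save.
        raise XMLCodingError(
            f"cannot decode as {encoding}; declare the encoding in the "
            f"XML declaration or save the file as UTF-8"
        ) from err
```

Under these assumptions a Latin-1 fragment with no declaration is
rejected, while the same bytes preceded by an XML declaration naming
ISO-8859-1 decode cleanly, which is the XML-level escape for legacy
documents.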

 > And when the user explicitly sets the file-coding-system to
 > latin-1, we ignore it and save as utf-8.

PSGML and nXML were both written by James Clark; if they try to
enforce Unicode, I'd suggest that maybe somebody who knows more about
XML than all of us put together made that decision.

Of course in the end the users have reasons whereof OASIS does not
know, so that there must be escapes for users with legacy documents,
or with legacy document standards.  But users should be strongly
encouraged to use XML mechanisms, *not* those of Mule, to cope.  Mule
is designed to cope with environments where there are few rules and
those are poorly understood by programmers and users alike.  XML is
about having good rules, well understood by programmers (especially of
the UI) so that users don't have to.

PSGML and AUCTeX, at least, provide methods by which a master document
can be associated with a document fragment which provides various
kinds of context for the fragment -- it's not rocket science.  I would
imagine that nXML does too.  So, for example, upon detecting a coding
conflict, Emacs could offer to (1) insert an appropriate processing
instruction, or (2) associate the current fragment with an existing
master document via file locals, or (3) associate the fragment with a
dummy master document that lives entirely in Customize.  Those
documents could provide other context too, such as importing DTDs and
