Re: bug xml parser emacs 21.1

From: Alex Schroeder
Subject: Re: bug xml parser emacs 21.1
Date: Mon, 05 Nov 2001 14:12:48 +0100
User-agent: Gnus/5.090004 (Oort Gnus v0.04) Emacs/21.1 (i686-pc-linux-gnu)

address@hidden (Karl Eichwalder) writes:

>> I do not agree.  If you look at the "Document Object Model (DOM) Level
>> 3 Core Specification",
> You'd better check the XML specification.

I did that now, and it seems that the XML parser is allowed to do
either:

    [Definition: Comments may appear anywhere in a document outside
    other markup; in addition, they may appear within the document
    type declaration at places allowed by the grammar.] They are not
    part of the document's character data; an XML processor may, but
    need not, make it possible for an application to retrieve the text
    of comments.

Specifically, if we decide to write a DOM implementation for XML
documents parsed using xml.el, then we will want comments in the DOM
-- thus xml.el should keep comments in the parsed data.

>> Comments are part of the parsed data.
> Does this mean the XML parser is considered to resolve entities within
> comments?  This would be weird.

Of course this would be weird.  Reading the DOM spec again:

    Object Comment
         Comment has the all the properties and methods of the
         CharacterData object as well as the properties and methods
         defined below.

(Nothing follows "below".)

This means, I believe, the following:

1. A comment is a node, and, more importantly,
2. A comment contains data -- a string.

This is based on the following excerpt of the spec:

     The CharacterData object has the following properties:
          data
               This property is of type String, can raise a DOMException
               object on setting and can raise a DOMException object on
               retrieval.
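
To make the two points concrete, here is a minimal sketch outside Emacs. It assumes Python's standard xml.dom.minidom module as a stand-in for a DOM implementation; it is not xml.el code, and the sample document and the helper name describe_children are made up for illustration. It shows a comment turning up as a node whose data is a plain string, with the entity reference inside the comment left unresolved:

```python
from xml.dom import minidom, Node

# Hypothetical sample document; the comment contains an entity reference.
SAMPLE = "<doc><!-- fish &amp; chips --><p>fish &amp; chips</p></doc>"


def describe_children(xml_text):
    """Return (kind, string data) pairs for the children of the root element."""
    root = minidom.parseString(xml_text).documentElement
    result = []
    for child in root.childNodes:
        if child.nodeType == Node.COMMENT_NODE:
            # A Comment is a CharacterData node: its text is in .data
            result.append(("comment", child.data))
        elif child.nodeType == Node.ELEMENT_NODE:
            # Join the text nodes directly below the element
            text = "".join(
                n.data for n in child.childNodes
                if n.nodeType == Node.TEXT_NODE
            )
            result.append(("element", text))
    return result


children = describe_children(SAMPLE)
print(children)
```

In this sketch the comment's data keeps the literal "&amp;", while the element text has it resolved to "&", so keeping comments in the parsed data does not require resolving entities within them.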

>> Perhaps they are not used by the application, but stripping them from
>> the model would be the wrong thing to do.
> No, this is the correct default behavior.  Maybe, there should be a
> switch to pass through comments unmodified.

If there is a switch, then I agree with you.  I do not understand why
the default should be to strip comments, however.

Consider the following cases:

You read an XML document, and write it back into another file.  Do you
want all the comments to be lost?  I don't.

Do you think users expect comments to be unavailable in the parsed XML
document?  Based on my day job, where we deal with DOM as part of a
J2EE framework, I don't.
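
The first case, reading a document and writing it back, can be sketched as follows. This is an illustration only, assuming Python's standard xml.etree.ElementTree (Python 3.8 or later for the insert_comments argument), which happens to be one library that strips comments by default and offers a switch to keep them. The sample input and the helper name round_trip are hypothetical, and nothing here describes how xml.el behaves:

```python
import xml.etree.ElementTree as ET

# Hypothetical input document with one comment.
SAMPLE = "<doc><!-- keep me --><p>text</p></doc>"


def round_trip(xml_text, keep_comments=False):
    """Parse xml_text and serialize it again, optionally keeping comments."""
    # insert_comments is the TreeBuilder switch; it defaults to False.
    builder = ET.TreeBuilder(insert_comments=keep_comments)
    parser = ET.XMLParser(target=builder)
    root = ET.fromstring(xml_text, parser=parser)
    return ET.tostring(root, encoding="unicode")


default_out = round_trip(SAMPLE)
kept_out = round_trip(SAMPLE, keep_comments=True)
print(default_out)
print(kept_out)
```

With the switch off, the comment is silently gone from the written document; with it on, the output matches the input. Which of the two should be the default is exactly the question at issue here.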

Therefore, such a default setting would surprise *me*.  Maybe other
people can relate their expectations so that we may understand why
people expect comments to disappear from the XML document when it gets
parsed.

