gnustep-dev

Re: Unresolved Issues with libxml2


From: Richard Frith-Macdonald
Subject: Re: Unresolved Issues with libxml2
Date: Thu, 1 Mar 2012 10:08:08 +0000

On 1 Mar 2012, at 09:43, Fred Kiefer wrote:

> On 01.03.2012 07:34, Richard Frith-Macdonald wrote:
>> On 29 Feb 2012, at 22:28, Doug Simons wrote:
>>> Since we've submitted the new implementation of the NSXML...
>>> classes based on libxml2 and people are beginning to use them, I
>>> thought I would mention some remaining unresolved issues in the
>>> hope that other people might have more experience with the libxml2
>>> libraries and have some ideas about how to solve them. These are
>>> currently the top issues on my list:
>>> 
>>> 1. Parsing an XML document generates text nodes in the tree for
>>> whitespace between elements even when the XML_PARSE_NOBLANKS
>>> option is given.  Cocoa doesn't do this.
>> 
>> Try setting the keepBlanks field in the parser ... I looked at the
>> GSXML source in the base library, and that seems to be what it does.
> 
> The XML parsing code here isn't using GSXML. (See [NSXMLDocument
> -initWithData:options:error:])

Sure ... but both NSXML... and GSXML... use libxml2 to do the parsing ... so it 
seems reasonable that NSXML... could ask libxml2 to do things in the same way 
that GSXML does.

>>> 2. Find a way to control formatting of "empty" nodes. Cocoa has
>>> the options NSXMLNodeExpandEmptyElement and
>>> NSXMLNodeCompactEmptyElement to control whether an empty node foo
>>> is displayed as "<foo></foo>" or as "<foo/>". Currently our
>>> libxml2 implementation will only display the latter.
>> 
>> I think setting xmlSaveNoEmptyTags should control this.
>> 
>>> 3. Find a way to control "pretty-print" formatting. Cocoa has an
>>> option NSXMLNodePrettyPrint to control whether a string
>>> representation of a tree will include indentation for enhanced
>>> readability or not. Currently our libxml2 implementation always
>>> includes indentation.
>> 
>> Looking at GSXML, it calls xmlDocDumpFormatMemoryEnc to dump output,
>> and the last argument of the function should control whether
>> indenting is done or not.  There's also xmlDocDumpMemoryEnc which
>> might produce output with different/no formatting.  I modified the
>> code to try setting the flag to control formatting, but I haven't
>> written a test case for it (sorry ... ran out of time for now ...
>> need to get children out of bed and to school).
> 
> Maybe the unification of the two XML implementations (GSXML in Additions
> and all the NSXML classes) should be the next step?

No ... they wrap libxml2 in quite different ways, so there's no point 
trying to reimplement one in terms of the other (there might have been some 
point to implementing the NSXML classes on top of the GSXML classes in order to 
get them working more quickly ... but it would have been much more effort in 
the long run).

> I don't know about the state of the GSXML classes. Are they actually usable
> and in use?

Yes, they've been in daily commercial use since they were first written (they 
might not be used much in free apps though). 

> In the file GSXML.m there seem to be rather unrelated concepts like the 
> XMLRPC stuff. One goal could be to extract that code and get it working with 
> NSXML classes.

Again I don't see any point in that since Apple don't really support it.  We 
have working XMLRPC and XSLT as part of GSXML ... why change it?  Once the new 
classes are ready for real world use, we can deprecate the GSXML methods and, 
after a few years, remove them.

We also have a separate XMLRPC implementation in the WebServices library if you 
want something more lightweight (it doesn't use libxml2).  Since it doesn't 
need/use libxml2 there's no need to build something round a libxml2 wrapper 
like NSXMLNode.

>> But these seem rather cosmetic issues ... to my mind the place that
>> really needs work is defining an object ownership model and
>> implementing memory management correctly.  I recall that this was
>> about the hardest part of writing the original XML DOM support for
>> gnustep-base several years ago, and it's something we need to get
>> sorted out for NSXMLNode before people start using it seriously.
> 
> I fully agree here. Although I am not sure I like the old GSXML 
> implementation. To me it seems that there we get a new document each time we 
> ask a node for its document. And the same seems to be true for all the tree 
> walking methods. Now what happens if a user retains such an object? Shouldn't 
> it now keep its tree alive?

Yes, the GSXML classes are a totally different design and intended to be used 
in a somewhat different way (and I don't think they ever got memory management 
right ... just 'good enough' to be able to avoid leaking).

> What we need here is more test cases.

Agreed.  In particular we need a lot of tests to run on Apple systems to find 
out exactly how their object ownership model works.  What happens when you 
create documents and then release them while holding references to various 
nodes within them?  Does behaviour vary depending on how documents are created? 
What about standalone nodes without documents?  What about namespaces?!!




