[PATCH] Fix of upstream parsing of CDATA

From: Linus Björnstam
Subject: [PATCH] Fix of upstream parsing of CDATA
Date: Thu, 16 Jan 2020 13:00:25 +0100
Hello Guilers!

RhodiumToad found an error in sxml where it would not properly parse CDATA: &gt;
would be converted to > inside CDATA blocks. This is probably due to a
misreading of the XML spec:

    "Within a CDATA section, only the CDEnd string is recognized as markup, so
that left angle brackets and ampersands may occur in their literal form; they
need not (and cannot) be escaped using '&lt;' and '&amp;'."

Notice that it mentions that only CDEnd is recognized, but omits &gt; in the
enumeration of things that need-not-and-cannot be escaped.

No other XML libraries behave this way. Take for example Python's ElementTree:

Python 2.7.17 (default, Dec 23 2019, 21:25:33)
>>> import xml.etree.ElementTree as ET
>>> root = ET.fromstring("<e><![CDATA[&gt;]]></e>")
>>> root.text
'&gt;'

The same thing with the un-patched (sxml ssax) (or rather (sxml simple)) looks
like this:

(xml->sxml "<e><![CDATA[&gt;]]></e>")
;; => (*TOP* (e ">"))
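For comparison, here is a minimal Python 3 sketch of the behaviour I would
expect from a parser. It is not part of the patch, and the helper name
element_text is made up for illustration. It parses the same reference once
inside a CDATA section and once as ordinary character data:

```python
import xml.etree.ElementTree as ET


def element_text(xml_source):
    """Parse xml_source and return the text of its root element.

    Hypothetical helper, used only for this illustration.
    """
    return ET.fromstring(xml_source).text


# Inside CDATA, "&gt;" is plain character data and must be kept as-is.
cdata_text = element_text("<e><![CDATA[&gt;]]></e>")

# Outside CDATA, the same reference is expanded to ">".
plain_text = element_text("<e>&gt;</e>")

print(repr(cdata_text))
print(repr(plain_text))
```

Only the second case should come out as ">". The un-patched sxml returns ">"
for the CDATA case as well.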

The question is whether this patch should be sent upstream. Since there has 
been very little activity there, I suspect it is a lost cause.

Failing tests have been looked through, verified and fixed. No unexpected 
errors were encountered. All SXML tests pass after this patch.

Best regards
  Linus Björnstam

