Re: [igraph] Writing GraphML with weird characters

From: Tamas Nepusz
Subject: Re: [igraph] Writing GraphML with weird characters
Date: Mon, 17 Oct 2011 16:34:33 +0200

Hi Victor,

Technically, a CDATA section is not required -- if the attribute values
contain characters like "<", ">", "&", "'" or the double quotation mark,
igraph escapes them using the standard &-based XML escape sequences (e.g.
&lt; for "<"). All other characters should be encoded in UTF-8, and igraph
writes them out unchanged. E.g., in Python:

>>> from igraph import Graph
>>> g = Graph()
>>> g["name"] = u"\u1234 < > & '"
>>> g.write_graphml("test.graphml")

yields the following GraphML file:

<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
<!-- Created by igraph -->
  <key id="name" for="graph" attr.name="name" attr.type="string"/>
  <graph id="G" edgedefault="undirected">
    <data key="name">ሴ &lt; &gt; &amp; &apos;</data>
  </graph>
</graphml>

where the first character of the attribute value is the Unicode character
U+1234 (i.e. code point 1234 in hexadecimal). According to the XML
validator at http://www.validome.org/xml/validate/, the generated file is
valid.
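
If you want to check the escaping without igraph at hand, the same
&-based substitutions can be reproduced with Python's standard library.
This is only a rough sketch and not igraph's actual code: it assumes
Python 3, and escape_attribute_value is a made-up helper name. It escapes
the sample value and then parses the result back to confirm that the
round trip is lossless:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_attribute_value(value):
    """Escape the five XML special characters (hypothetical helper)."""
    # escape() replaces &, < and > by default; the two quote characters
    # are passed explicitly as extra substitutions.
    return escape(value, {"'": "&apos;", '"': "&quot;"})


value = "\u1234 < > & '"
escaped = escape_attribute_value(value)
print(escaped)

# Round trip: wrap the escaped text in a <data> element and parse it back.
document = '<data key="name">' + escaped + "</data>"
parsed = ET.fromstring(document)
assert parsed.text == value
```

The parser turns the escape sequences back into the original characters,
which is why no CDATA section is needed.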

Footnote: there is a catch when you load the GraphML file back into igraph
in Python. Since Python 2 has separate data types for Unicode strings and
"normal" (byte) strings, the "name" attribute will be a standard string
containing the original string in UTF-8 encoded form, and you must convert
it back to Unicode manually as follows:

>>> g["name"] = g["name"].decode("utf-8")
>>> g["name"]
u"\u1234 < > & '"
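
If your code has to cope with both kinds of strings, the conversion can
be wrapped in a small helper. The sketch below assumes Python 3, where
the byte string type is called bytes and ordinary strings are already
Unicode; to_text is a made-up name, and the snippet makes no claim about
which type a particular igraph version returns:

```python
def to_text(value, encoding="utf-8"):
    """Return value as a text string, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


original = "\u1234 < > & '"
raw = original.encode("utf-8")  # the UTF-8 encoded form of the attribute

# Encoded input is decoded; text input is returned unchanged.
assert to_text(raw) == original
assert to_text(original) == original
```

With such a helper, g["name"] = to_text(g["name"]) is safe to call no
matter which of the two types the attribute currently holds.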


On 10/17/2011 03:59 PM, Víctor Pascual Cid wrote:
> Hi all,
> I need to generate a GraphML which nodes contain some weird characters. The 
> way to deal with strange characters in XML is to use the tag CDATA. However, 
> I haven't seen this possibility with write.graph(g, format="graphml").
> Any hint or workaround to solve this problem?
> Cheers,
> Víctor
