Re: [Help-glpk] comments in csv data files

From: Chris Wolf
Subject: Re: [Help-glpk] comments in csv data files
Date: Mon, 17 May 2010 11:29:00 -0400
User-agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv: Gecko/20100317 Thunderbird/3.0.4

As Andrew points out, there is no standard way to embed comments in the CSV format.
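
To make the lack of a standard concrete, here is a minimal Python sketch of the kind of local convention Andrew proposes in the quoted message below: treat any line whose first character is '#' as a comment and drop it before parsing. The function name, the choice of '#', and the sample data are assumptions for illustration only; nothing in RFC 4180 defines this.

```python
import csv

def read_csv_skipping_comments(text, comment_char="#"):
    """Parse CSV text, ignoring lines whose first character is comment_char."""
    # Assumption: a comment is a whole line that starts with comment_char.
    # This is a local convention only; RFC 4180 has no notion of comments.
    # Caveat: a quoted field that spans lines and continues with
    # comment_char at the start of a line would be damaged by this filter.
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not line.startswith(comment_char)
    ]
    return list(csv.reader(kept))

# Hypothetical sample data with comment lines in two places.
sample = "# demand table\nfrom,to,cost\nA,B,3\n# interim note\nB,C,5\n"
rows = read_csv_skipping_comments(sample)
```

Any reader that does not know about the convention will still see the '#' lines as data, which is exactly the portability problem.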

There are a number of ways you can embed comments in an XML document.  One is
to declare an element or attribute for this purpose in the schema (or DTD).  The
more general way is to use SGML comments, i.e.  "<!-- this is a comment -->".

The advantage of the former technique is that the comment is actually not
an XML comment, but an XML-parsable part of the document, which XML processors
can access and pass on in an XML processing pipeline.  

Probably the simplest approach would be to just use SGML-style comments, which
are ignored by XML readers in a pipeline, so there's no need to use "grep".
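
As a small illustration of that point, the Python sketch below parses a document containing SGML-style comments with the standard library's xml.etree.ElementTree, whose default parser discards comments. The element names and values are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two data rows with comments in between.
DOC = """<table>
  <!-- costs are in USD -->
  <row><from>A</from><to>B</to><cost>3</cost></row>
  <!-- provisional figure -->
  <row><from>B</from><to>C</to><cost>5</cost></row>
</table>"""

# The default parser drops comments, so only the <row> elements
# appear as children of the root.
root = ET.fromstring(DOC)
child_tags = [child.tag for child in root]
rows = [[cell.text for cell in row] for row in root]
```

The comments never reach the code that consumes the rows, so no separate filtering step is needed.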

I would also advise against depending on a GUI-based editor such as this 
"XMLFox", since it's platform-dependent (Windows only) and requires 
manual interaction in a production workflow.

You are much better off writing a simple XSLT script to process your
XML document to perform any transformations that are needed, such as
re-mapping column order, filtering out columns, converting to CSV, etc.

In this way, running the XSLT is a simple one-liner command on
any platform.

Here is an XML=>CSV example:

The command invocation on MacOS/Linux would be:

$ xsltproc tocsv.xsl data.xml  > data.csv

On Windows the invocation would be:

c:\> msxsl data.xml tocsv.xsl > data.csv

(Note the command arguments are reversed with "msxsl")

I cleaned up the XSL script a little:


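<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Convert XML to CSV. Assumes only one level of nesting. -->
  <xsl:output method="text" encoding="iso-8859-1"/>
  <xsl:strip-space elements="*" />

  <!-- The select expressions and closing tags were cut off in the archived
       copy; select="." and everything after it are an assumed reconstruction. -->
  <xsl:template match="/*/child::*">
    <xsl:for-each select="child::*">
      <xsl:if test="position() != last()">"<xsl:value-of select="."/>",</xsl:if>
      <xsl:if test="position() = last()">"<xsl:value-of select="."/>"<xsl:text>&#xA;</xsl:text></xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>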

If you really want DOS line endings in your CSV result, you can replace "&#xA;" 
with "&#xD;&#xA;"

This script can be easily extended to re-map and/or filter columns.
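
To show what such a re-mapping amounts to, here is the same idea sketched in Python rather than XSLT, purely for illustration. It keeps columns B, D and F from the example in the quoted message and writes them in the order F, B, D. The function name, the element names and the sample record are assumptions, not part of the script above.

```python
import csv
import io
import xml.etree.ElementTree as ET

def xml_to_csv(xml_text, columns):
    """Write one CSV line per record, keeping only `columns`, in that order."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    # Quote every field and end lines with LF, mirroring the XSL script.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for record in root:
        # findtext falls back to "" when a column is missing from a record.
        writer.writerow([record.findtext(name, default="") for name in columns])
    return out.getvalue()

# Hypothetical input with columns A..F, one level of nesting per record.
DOC = "<data><rec><A>1</A><B>2</B><C>3</C><D>4</D><E>5</E><F>6</F></rec></data>"
result = xml_to_csv(DOC, ["F", "B", "D"])
```

In XSLT the equivalent change would be to select the wanted child elements by name, in the desired order, instead of looping over every child.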


Chris Wolf

On 5/17/10 10:40 AM, Nigel Galloway wrote:
> I might have mentioned the benefits of XML before. It has been helpfully 
> pointed out that a program may be written in grep to convert csv to XML. 
> Careful research has revealed that Excel can read csv and save it as XML.
> How then to include the useful comments.
> It would be possible to include the comments and then write a program in grep 
> to remove them.
> One of the benefits of XML is that someone may already have done this for 
> me!!!!
> You are looking to create an Excel file by selecting only some of 
> the Columns from an XML. The columns are not in the correct 
> sequence for the Excel so they need to be mapped.
> You have an XML that contains columns A, B, C, D, E, F, etc.
> You only need data from columns B, D, and F.
> But in the output Excel file the sequence has to be F, B, and D.
> To accomplish the task we will use XMLFox Advance that is a useful 
> XML and XSD schema editor. By using XMLFox Advance you can output 
> data to several other data format files. The Editor allows you 
> export XML tables or whole XML to the following data files: TXT; 
> upload XML into MS SQL Server database, convert into CSV (Comma 
> Separated Value) file, convert into HTML page, create MS Access 
> database, upload XML into SQL Server database, convert to PDF, and 
> create Excel file.
> Full details of the above (reproduced without any permission) may be found at:
>> ----- Original Message -----
>> From: Andrew Makhorin <address@hidden>
>> To: address@hidden
>> Subject: [Help-glpk] comments in csv data files
>> Date: Sun, 16 May 2010 13:36:57 +0400
>> I found that it would be convenient to allow comment lines in csv data
>> files read from mathprog models thru the table statement. Unfortunately,
>> the RFC document 4180 that specifies the csv format says nothing about
>> such a feature.
>> Probably, a comment line can be indicated by its very first character
>> (like in many scripting languages): '#', ';', '*', or maybe '%'.
>> Another issue is whether to allow comment lines everywhere in the file
>> or only in the beginning. The latter seems safer, because the first line
>> contains field names which, as a rule, contain no special characters.
>> Any opinions/suggestions are appreciated. Thanks.
>> Andrew Makhorin
>> _______________________________________________
>> Help-glpk mailing list
>> address@hidden
