guile-user
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Converting s-expressions to XML


From: address@hidden
Subject: Re: Converting s-expressions to XML
Date: Thu, 17 Jun 2010 15:21:08 -0400

Here's an excerpt from my sexp to XML interface for Proverb:

  Generating XML in Scheme


Proverb lets you generate and present XML-valued objects from Scheme with little fuss. 
The most crude interface is the procedure make-raw . This procedure is called with a valid XML fragment as a string and returns a new XML-valued object whose contents are that fragment.

(make-raw "<em>An Emphasized Phrase</em>") yields An Emphasized Phrase .

Any Scheme value can be converted into an XML fragment as a string with xmlize-value .

(xmlize-value (make-raw "<em>An Emphasized Phrase</em>")) yields "<em>An Emphasized Phrase</em>" .

And Scheme values can be converted into an XML document as a string with xmlize .

(xmlize (make-raw "<em>An Emphasized Phrase</em>")) yields "<?xml version=\"1.0\"?>
<?xml-stylesheet type=\"text/xsl\" href=""><sexp xmlns=\"http://www.w3.org/2000/sexp\">
<em>An Emphasized Phrase</em></sexp>
"
 .

It is unsurprising to see that the text supplied to make-raw is passed through into the ultimate XML unmolested.

We can see values like strings and lists are adulterated but not quite molested.

(xmlize-value "foo") yields "<string>foo</string>"
(xmlize-value (list "foo")) yields "<list><string>foo</string></list>"

Most users and procedures that generate XML values neither explicitly generate these corresponding XML string fragments nor embed XML string fragments directly in their programs.

Scheme values are transformed automatically into XML for the Proverb web interface and XML-valued objects can be generated more elegantly with the procedure with-xml-tag .

(with-xml-tag 'em (make-raw "A Better Way")) yields A Better Way

That example certainly looks a bit more like Scheme but papers over an irritating detail. The call to make-raw expects a string laden with XML markup. Any XML characters in the text would need to be escaped with entity codes. We would need to write:

(with-xml-tag 'b (make-raw "&lt;-o->")) just to get Darth Vader's TIE fighter <-o->

String escaping always ends in tears. This also ends in tears but for a better reason:

(with-xml-tag 'b "<-o->") yields "<-o->"

When we generate XML for Scheme values that are not naturally XML, like Scheme strings, we go to some effort to provide the type of print fidelity provided by Scheme write . We put enough markup around these values to allow them to be distinguished in the browser. This decorates our  classy ASCII with extra quotes and other styling.


We square this circle with the helper procedure make-text that generates free XML text from its string argument.

(with-xml-tag 'b (make-text "<-o->")) yields <-o-> after too much work!

A better way to square circles generally would be with SVG: 

(with-xml-tag '(svg
                  xmlns "http://www.w3.org/2000/svg"
                  width 50
                  height 50
                  viewbox "0 0 50 50")
  (with-xml-tag 'g
    (with-xml-tag '(rect width 50 height 50 fill green))
    (with-xml-tag '(circle cx 25 cy 25 r 25 fill red))
  )
)

which yields  OMITTED FROM THIS EMAIL

And which introduces a new detail of with-xml-tag . This procedure accepts as its first argument either a symbol naming an XML tag or a list naming an XML tag and encoding a number of XML tag attribute keys and values. 

---------

The five referenced functions and syntax forms:
xmlize
xmlize-value
with-xml-tag
make-raw
make-text

have pretty straightforward definitions. The largest, and the one that most resembles your example kernel, is xmlize-value. Its definition is provided below:

(define xmlize-value (lambda (value)
  (cond ((and (vector? value) (> (vector-length value) 1))
         (cond ((eq? (vector-ref value 0) xml-xhtml)
                (string-append
                  "<xhtml>"
                  (apply string-append
                         (map xmlize-value (vector-ref value 1)))
                  "</xhtml>"))
               ((eq? (vector-ref value 0) xml-raw)
                (vector-ref value 1))
               ((eq? (vector-ref value 0) xml-text)
                (let* ()
                  (letrec ((p (object-property value (quote charset))))
                    (if p
                      (string-append
                        "<text charset='"
                        (xml-escape-string p)
                        "'>"
                        (xml-escape-string (vector-ref value 1))
                        "</text>")
                      (string-append
                        "<text>"
                        (xml-body-escape-string (vector-ref value 1))
                        "</text>")))))
               (else
                (apply string-append
                       "<vector>"
                       (append
                         (map xmlize-value (vector->list value))
                         '("</vector>"))))))
        ((vector? value)
         (apply string-append
                "<vector>"
                (append
                  (map xmlize-value (vector->list value))
                  '("</vector>"))))
        ((null? value) "<list/>")
        ((list? value)
         (cond ((and (eq? (car value) (quote quote))
                     (eq? (length value) 2))
                (string-append
                  "<quote>"
                  (xmlize-value (cadr value))
                  "</quote>"))
               ((and (eq? (car value) (quote quasiquote))
                     (eq? (length value) 2))
                (string-append
                  "<quasiquote>"
                  (xmlize-value (cadr value))
                  "</quasiquote>"))
               (else
                (apply string-append
                       "<list>"
                       (append
                         (map xmlize-value value)
                         '("</list>"))))))
        ((pair? value)
         (string-append
           "<pair>"
           (xmlize-value (car value))
           (xmlize-value (cdr value))
           "</pair>"))
        ((hash-table? value)
         (apply string-append
                "<hash>"
                (hash-fold
                  (lambda (k v acc)
                    (cons (xmlize-value (vector k v)) acc))
                  '("</hash>")
                  value)))
        ((string? value)
         (let* ()
           (letrec ((p (object-property value (quote charset))))
             (if p
               (string-append
                 "<string charset='"
                 (xml-escape-string p)
                 "'>"
                 (xml-escape-string value)
                 "</string>")
               (string-append
                 "<string>"
                 (xml-body-escape-string value)
                 "</string>")))))
        ((symbol? value)
         (string-append
           "<symbol name=\""
           (xml-quote-escape-string (symbol->string value))
           "\"/>"))
        ((boolean? value)
         (if value
           "<boolean value=\"true\"/>"
           "<boolean value=\"false\"/>"))
        ((number? value)
         (string-append
           "<number value=\""
           (number->xml-string value)
           "\"/>"))
        ((record? value)
         (string-append
           "<text>"
           (xml-body-escape-string (object->string value))
           "</text>"))
        ((unspecified? value) "<unspecified/>")
        ((char? value)
         (string-append
           "<char code='"
           (number->string (char->integer value))
           "'>"
           (xml-body-escape-string (object->string value))
           "</char>"))
        (else
         (string-append
           "<text>"
           (xml-body-escape-string (object->string value))
           "</text>")))))


Sent from my iPad

On Jun 17, 2010, at 11:00 AM, Josef Wolf <address@hidden> wrote:

Hello,

I am trying to write a (simple) function to convert s-expressions to XML.
I've come up with following function, which (somehow) works:

 (use-modules (ice-9 rdelim))
 (use-modules (ice-9 pretty-print))

 (define atom?
   (lambda (x)
     (and (not (pair? x)) (not (null? x)))))

 (define (indent string count)
   (if (< count 1)
       ""
       (string-append string (indent string (- count 1)))))

 (define (walklist expr level)
   (if (null? expr)
       '()
       (begin
         (sexp->xml (car expr) (+ level 1))
         (walklist (cdr expr) level))))

 (define (unknown->string expr)
   (let ((s (open-output-string)))
     (display expr s)
     (get-output-string s)))

 (define (sexp->xml expr . params)
   (let ((level (if (null? params) 0 (car params))))
     (cond
      ((atom? expr)
       (simple-format #t "~A~A\n" (indent "  " level) expr))

      ((list? (car expr))
       (sexp->xml (car expr) (+ level 1))
       (walklist (cdr expr) level))

 ;     ((pair? expr)
 ;      (simple-format #t "~A~A\n" (indent "  " level) (car expr))
 ;      (simple-format #t "~A~A\n" (indent "  " level) (cdr expr)))

      (#t
       (let ((s  (unknown->string (car expr))))
         (simple-format #t "~A~A~A~A\n" (indent "  " level) "<" s ">")
         (walklist (cdr expr) level)
         (simple-format #t "~A~A~A~A\n" (indent "  " level) "</" s ">"))))))


 (sexp->xml '(person
              (name "myself")
              (address
               (street "somestreet" 2)
               (town 1234 "thistown")
               (country "wonderland")
               (test 'a b)
;                (test1 . asd)
               (test2 '(asd "asd")))))


While this seems to work, it has some drawbacks, which I can not figure out
how to get rid of them:

- It doesn't work on pairs. When I uncomment the case which checks for pairs,
  the result is no longer XML. Instead, it looks more like the output of
  (display)

- There is no way to tell symbols from strings in the XML output.

- I'd like atoms to be enclosed into its tags without any whitespace. While
  it is easy to get rid of the indentation and the trailing newline, I can't
  figure out how to get rid of the newline that comes behind the opening tag

Any hints?

BTW: In addition, I'd appreciate any hints how to make this thing more
    scheme'ish. I think there is lots of potential for improvement here.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]