guix-devel

Re: Help with sxml simple parser for the quicklisp importer


From: swedebugia
Subject: Re: Help with sxml simple parser for the quicklisp importer
Date: Wed, 23 Jan 2019 17:03:02 +0100

On 2019-01-23 15:22, Ricardo Wurmus wrote:
Hi,

(define (get-homepage name)
   "Get the latest meta release file. From the links in this we extract all
other information we need."
   (call-with-temporary-output-file
    (lambda (temp port)
      (and (url-fetch (homepage name) temp)
           (xml->sxml (get-string-all port))))))

Aside: you don’t need to use “get-string-all”; “xml->sxml” can read
directly from a port.
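
As a minimal sketch of that point, assuming Guile with the (sxml simple) module: the string port below stands in for the temporary file's port, and `parse-xml-string` is a hypothetical helper name, not part of the importer.

```scheme
(use-modules (sxml simple))

;; Hypothetical helper: open an input port on STR and hand the port
;; itself to xml->sxml, with no intermediate get-string-all call.
(define (parse-xml-string str)
  (call-with-input-string str xml->sxml))

(define doc (parse-xml-string "<a><b>hi</b></a>"))
;; doc => (*TOP* (a (b "hi")))
```

In the procedure above, the same idea would presumably amount to calling (xml->sxml port) in place of (xml->sxml (get-string-all port)).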

But it errors out with:

  sxml/simple.scm:143:4: In procedure loop:
Throw to key `parser-error' with args `(#<input: string 23c45b0>
"[GIMatch] broken for " (END . head) " while expecting " END link)'.

I fetched the document.  Here’s the part that it barfs on:

--8<---------------cut here---------------start------------->8---
<!DOCTYPE html>
<html>
<head>
   <meta charset="utf-8">
   <title>1am | Quickdocs</title>
   <link rel="stylesheet" type="text/css" href="/css/LigatureSymbols/style.css" />
<link rel="stylesheet" type="text/css" media="screen" href="/css/main.css">

   <script type="text/javascript" src="/js/jquery-1.9.1.min.js"></script>
   <script type="text/javascript" src="/js/underscore-min.js"></script>
   <script type="text/javascript" src="/js/quickdocs.js"></script>
</head>
…
--8<---------------cut here---------------end--------------->8---

The second “link” tag opens but is never closed.  This may be valid
HTML, but it is not valid XML, which is what xml->sxml expects.
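
The difference can be reproduced in isolation with a small sketch, assuming Guile with the (sxml simple) module; `try-xml->sxml` is a hypothetical helper name, and the two input strings are reduced stand-ins for the fetched page.

```scheme
(use-modules (sxml simple))

;; Hypothetical helper: return the SXML tree for STR, or #f when the
;; parser throws `parser-error' because STR is not well-formed XML.
(define (try-xml->sxml str)
  (catch 'parser-error
    (lambda () (call-with-input-string str xml->sxml))
    (lambda (key . args) #f)))

;; Self-closed "link": well-formed XML, so a tree is returned.
(define closed (try-xml->sxml "<head><link href=\"a.css\" /></head>"))

;; Unclosed "link": the parser reaches </head> while still expecting
;; </link>, throws `parser-error', and the handler returns #f.
(define unclosed (try-xml->sxml "<head><link href=\"a.css\"></head>"))
```

Catching the error this way only detects the problem; it does not repair the document.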

Thanks for the quick answer!
I will try to remove this line before handing it over to the parser.

--
Cheers Swedebugia


