From: Andy Bennett
Subject: Re: [Chicken-users] html->sxml (html-parser egg) does not decode entities in html attributes, ideas why?
Date: Fri, 09 May 2014 00:26:51 +0100
User-agent: Trojita/0.4.1; Qt/4.8.2; X11; Linux; Debian GNU/Linux 7.4 (wheezy)


Empty attributes now seem to decode to the string "()".


Thanks! :-) That works for me now:

#;4> (html->sxml empty)
(*TOP* (div (@ (data "")) "empty"))

During &quot; deserialisation when inside an attribute, we seem to get data from earlier in the stream introduced:

I couldn't reproduce this.  Could you check with the latest fix?

Which CHICKEN are you using? I can reproduce it with 0.5.2 on 4.9.0rc1:

#;5> (html->sxml content)
(*TOP* (br) "\r\n" (br) "\r\n" (div (@ (data "(sxml (@ (attr \"\r\nbr\r\nbr12345\"\r\nbr\r\nbr)) body)")) "div body"))

...but not with 0.5.2 on

#;4> (html->sxml content)
(*TOP* (br) "\r\n" (br) "\r\n" (div (@ (data "(sxml (@ (attr "12345")) body)")) "div body"))

With 0.5.3 on 4.9.0rc1 it seems to work:

#;5> (html->sxml content)
(*TOP* (br) "\r\n" (br) "\r\n" (div (@ (data "(sxml (@ (attr \"12345\")) body)")) "div body"))

...but perhaps it's worth chasing this down a bit further?
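
For anyone who wants to try reproducing it, here is a minimal sketch. The definition of `content` isn't shown in this message, so the string below is only an assumption inferred from the REPL output above (two `<br>` tags, CRLF line endings, and a `data` attribute containing `&quot;` entities); the real input may well differ.

```scheme
;; CHICKEN 4 syntax; on CHICKEN 5 this would be (import html-parser).
(use html-parser)

;; Hypothetical reconstruction of `content`, inferred from the REPL
;; output in this thread. It is not the original definition.
(define content
  (string-append
   "<br>\r\n<br>\r\n"
   "<div data=\"(sxml (@ (attr &quot;12345&quot;)) body)\">div body</div>"))

;; Print the parsed SXML so the attribute value can be inspected.
(write (html->sxml content))
(newline)
```

If the attribute value comes back with anything other than the two decoded double quotes around 12345, that would point at the same problem.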

Thanks for all your help with this. :-)



