Re: [Chicken-users] Parsing HTML, best practice with Chicken


From: Alex Shinn
Subject: Re: [Chicken-users] Parsing HTML, best practice with Chicken
Date: Tue, 30 Dec 2014 10:18:33 +0900

On Tue, Dec 30, 2014 at 3:47 AM, mfv <address@hidden> wrote:
> Hello,
>
> > I somehow always manage to get it working with sxpath when I need to do
> > some web scraping, but it's somewhat painful.
>
> Thanks, I will have a look at sxpath.
>
> > > Are there any packages like Python's BeautifulSoup in the Chicken
> > > arsenal?
> >
> > That sort of thing is sorely lacking.  There's a promising "zipper"
> > library written by Moritz Heidkamp, but so far it's unreleased and
> > undocumented.  If you're feeling very adventurous you could have
> > a look at it: https://bitbucket.org/DerGuteMoritz/zipper
>
> Pity. I will have a look at the BeautifulSoup source. Maybe I can copy/mimic
> some of its functionality.

html-parser is intended to be the parsing side of BeautifulSoup.
The idea is to do one thing well, and leave it up to other libraries
to do matching and extraction.  As Peter says, matchable can be
cumbersome here because it doesn't do unordered matching.
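
To illustrate that split between parsing and extraction, here is a minimal
sketch, assuming the html-parser and sxpath eggs are both installed. The
sample markup and the helper name `extract-links` are hypothetical and only
there for illustration; html-parser does the parsing and sxpath does the
selection.

```scheme
;; A sketch, not a definitive recipe.  Assumes the html-parser and
;; sxpath eggs are installed.
(require-extension html-parser sxpath)

;; Hypothetical sample input, made up for this example.
(define sample-html
  "<p><a href=\"/one\">first</a> and <a href=\"/two\">second</a></p>")

;; Hypothetical helper: parse HTML text into SXML with html->sxml,
;; then select the text of every href attribute on an <a> element,
;; wherever it occurs in the tree.
(define (extract-links html)
  (let ((doc (html->sxml (open-input-string html))))
    ((sxpath '(// a @ href *text*)) doc)))

(write (extract-links sample-html))
(newline)
```

Because the `//` step selects descendants at any depth, the query does not
depend on where the links sit or how their siblings are ordered, which is
the part that tends to be awkward to express with matchable patterns.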

If you find any bugs or surprising behavior in html-parser please
let me know.

-- 
Alex

