salutations and web scraping

From: Catonano
Subject: salutations and web scraping
Date: Fri, 30 Dec 2011 23:58:47 +0100

Hello people,

Happy New Year.

I'm a beginner. I have never written a single line of Lisp or Scheme in my life, and I'm here to ask for directions and suggestions.

I'm mulling over a pet project. I would like to scrape the web site of a community radio station and grab the Flash-streamed content they publish. The material is published under a Creative Commons license, so what I'm planning is not illegal.

The reason they chose such an obtuse solution is that they are obtuse. They started the station in the 70s and now they don't get this new digital thing.

I read the web documentation. The client chapter suggests adopting an architecture similar to that of the server for parallel scrapers, and closes by floating the idea of using threads and futures.

I don't see how I could use threads or futures (I'm not even sure what they are), and I'm bold enough to ask you to write an example code skeleton for me.

Also, I was thinking of writing a scraper in Guile Scheme. The scraper would parse the HTML source for the relevant bits and then delegate the Flash stuff to a Unix command: wget, curl, or something similar. Is this reasonable? Is there any architectural glitch I'm missing here?
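To make the "parse, then delegate" split concrete, here is a rough sketch. It is written in Python rather than Guile, purely for illustration, and it rests on assumptions: that the relevant bits are plain links ending in a media extension, and that the files are reachable over ordinary HTTP. The names `MEDIA_EXTENSIONS`, `find_media_links` and `download_command` are made up for this example, and the real site may embed its streams differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumption: these extensions identify the "relevant bits" on the page.
MEDIA_EXTENSIONS = (".flv", ".mp3", ".mp4")


class MediaLinkParser(HTMLParser):
    """Collect href/src/value attributes that point at media files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "value") and value:
                if value.lower().endswith(MEDIA_EXTENSIONS):
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def find_media_links(html, base_url):
    """Return the absolute URLs of all media links found in html."""
    parser = MediaLinkParser(base_url)
    parser.feed(html)
    return parser.links


def download_command(url, out_dir="downloads"):
    """Build (but do not run) a curl invocation for one file."""
    filename = url.rsplit("/", 1)[-1]
    return ["curl", "--fail", "--location",
            "--output", f"{out_dir}/{filename}", url]
```

The scraper only decides what to fetch. The returned list can be handed to something like `subprocess.run(cmd, check=True)`, so the actual transfer stays in the external tool.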

Don't worry, people: I know that the server setup and the internet connection are not very robust, and I don't want to be hostile to the server, so I expect to run at most 2 parallel connections.
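The two-connection limit could look roughly like the sketch below. It is again in Python, purely for illustration, and is not the Guile API. A future here is a placeholder for a result that a worker thread fills in later. `fetch_all` and `fetch_one` are hypothetical names; `fetch_one` stands for whatever downloads a single URL.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_PARALLEL = 2  # the "at most 2 parallel connections" mentioned above


def fetch_all(urls, fetch_one, max_parallel=MAX_PARALLEL):
    """Run fetch_one(url) for every url, at most max_parallel at a time.

    Returns a dict mapping each url to its result, or to the exception
    that fetch_one raised for it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # submit() returns at once with a future; the pool never runs
        # more than max_parallel calls to fetch_one simultaneously.
        futures = {pool.submit(fetch_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # one failed download should not stop the rest
                results[url] = exc
    return results
```

The pool size is the only knob that controls how hard the server is hit. Adding a short pause inside `fetch_one` would make the scraper gentler still.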

Or, I was dreaming that I could try to integrate the thing with the Gnome environment and make it available from the Gnome Shell JavaScript, so that people in the community could use it to grab the footage themselves. I don't know.

Thanks so much for ANY hint.
