[Savannah-hackers-public] Re: GNU Planet and Savannah


From: Sylvain Beucler
Subject: [Savannah-hackers-public] Re: GNU Planet and Savannah
Date: Mon, 12 Jan 2009 08:23:31 +0100
User-agent: Mutt/1.5.18 (2008-05-17)

On Sun, Jan 11, 2009 at 05:30:58PM +0100, address@hidden wrote:
> 
> Hi Sylvain.
> 
>    Quick stat:
>    sv_sv:~# grep 'GNU Planet' /var/log/apache2/access.log| wc -l
>    216844
>    (out of ~3,000,000 hits, so around 6%)
> 
>    That's for a single week!  This places 'GNU Planet' as the 3rd best
>    crawler, between msnbot and Slurp ;)
> 
>    Apparently this matches:
>    360 GNU projects * 4x per hour * 24h * 7d
>    241920
> 
>    Do you have an idea on how to make this more efficient?
> 
> By reducing the fetch frequency to once per hour, maybe:
> 
> (* 360 24 7) 60480
> 
> Nacho, what do you think?

Maybe I could provide a Sitemap (http://www.sitemaps.org/protocol.php)
with 'last modified' fields, and you'd only grab newer/changed items?

I need to check whether edited news items do get a newer 'last modified'
date though - I saw that you overwrite edited news items, which is good.
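
To make the idea concrete, here is a rough sketch of what the fetching
side could look like, assuming the Sitemap follows the sitemaps.org 0.9
schema and carries <lastmod> values. The function names, the example.org
URLs and the sample data are made up for illustration; this is not the
actual GNU Planet or Savannah code.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(text):
    """Parse a <lastmod> value (date or date-time); assume UTC if no zone."""
    value = datetime.fromisoformat(text.strip().replace("Z", "+00:00"))
    if value.tzinfo is None:
        value = value.replace(tzinfo=timezone.utc)
    return value


def changed_urls(sitemap_xml, last_fetch):
    """Return the URLs whose <lastmod> is newer than last_fetch.

    Entries without <lastmod> are returned too, since we cannot tell
    whether they changed.
    """
    root = ET.fromstring(sitemap_xml.strip())
    changed = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if not loc:
            continue
        if lastmod is None or parse_lastmod(lastmod) > last_fetch:
            changed.append(loc)
    return changed


# Hypothetical sample data, for illustration only.
SAMPLE = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.org/news/a</loc>
    <lastmod>2009-01-10</lastmod>
  </url>
  <url>
    <loc>http://example.org/news/b</loc>
    <lastmod>2009-01-11T18:00:00+01:00</lastmod>
  </url>
  <url>
    <loc>http://example.org/news/c</loc>
  </url>
</urlset>
"""

if __name__ == "__main__":
    last_fetch = datetime(2009, 1, 11, tzinfo=timezone.utc)
    for loc in changed_urls(SAMPLE, last_fetch):
        print(loc)
```

With something like this, the aggregator would fetch one Sitemap per
run and then only request the items that changed since its last visit,
instead of polling every project feed each time.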

-- 
Sylvain



