From: Ron Johnson
Subject: Big XML files... (was Re: [Pan-users] Re: Better processing of very large groups?)
Date: Fri, 03 Jul 2009 21:56:36 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv: Gecko/20090103 Thunderbird/ Mnenhy/

On 2009-07-02 18:53, Duncan wrote:
> Ron Johnson <address@hidden> posted
> address@hidden, excerpted below, on Thu, 02 Jul 2009 13:14:20
>
>> Because giganews has such a long retention period, some groups can have
>> a very *large number* of messages.  If you subscribe to two or more of
>> them, you could run out of memory.
>>
>> As it is, pan seems to sequentially scan thru all messages when marking
>> a group of them as Read.
>>
>> There needs to be a better and less memory intensive method of handling
>> huge groups.  B-trees, hash tables, SQLite, I don't know, but
>> *something* better than the status quo.
>
> This is true, tho pan is far better than it used to be (it deals with
> multi-million messages now, where old-pan had problems with 100k).

One of the problems seems to be pan's use of big flat files. That's great for being able to peek into the inner workings of tasks.nzb, but every time an article gets successfully downloaded, pan must write out a fresh copy of the file in order to get rid of that one article. If tasks.nzb is large, that takes a while.

Similar problems in groups/.
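
To make the cost concrete, here is a rough Python sketch of what removing one entry from a flat file involves. This is not pan's actual code, and the one-ID-per-line file is a made-up stand-in for tasks.nzb (which is really XML); remove_entry and demo are hypothetical names. The point is only that dropping a single entry means copying every other entry.

```python
import os
import tempfile


def remove_entry(path, entry_id):
    """Remove one entry from a flat file by rewriting the whole file.

    Returns the number of lines that had to be copied to drop one entry.
    """
    dir_name = os.path.dirname(os.path.abspath(path))
    copied = 0
    with open(path, "r", encoding="utf-8") as src, \
            tempfile.NamedTemporaryFile("w", encoding="utf-8",
                                        dir=dir_name, delete=False) as dst:
        for line in src:
            if line.rstrip("\n") == entry_id:
                continue  # the single entry we want gone
            dst.write(line)  # everything else is copied again
            copied += 1
        tmp_name = dst.name
    os.replace(tmp_name, path)  # swap the new copy into place
    return copied


def demo(total=1000):
    """Build a hypothetical queue file and remove one entry from it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "tasks.txt")
        with open(path, "w", encoding="utf-8") as f:
            for i in range(total):
                f.write(f"<msg-{i}@example.invalid>\n")
        return remove_entry(path, "<msg-500@example.invalid>")
```

The work grows with the size of the file, not with the size of the change, so a long download queue pays that price once per finished article.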

This reminds me of mbox files in the email world, and it's why Maildir (where each email is a separate file) is so much more efficient at doing things other than adding new emails to the end of the file.
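
For comparison, a small sketch using Python's standard mailbox module shows the Maildir side of that: each message is its own file, so removing one is a single unlink and the rest are never touched. The helper names (build_maildir, count_files, demo) and the message contents are made up for illustration.

```python
import mailbox
import os
import tempfile


def build_maildir(root, count):
    """Create a Maildir at root (must not exist yet) holding count messages."""
    box = mailbox.Maildir(root, create=True)
    keys = []
    for i in range(count):
        msg = mailbox.MaildirMessage()
        msg["Subject"] = f"article {i}"
        msg.set_payload("body\n")
        keys.append(box.add(msg))  # each add() writes one new file
    return box, keys


def count_files(root):
    """Count the regular files under the Maildir tree."""
    return sum(len(files) for _, _, files in os.walk(root))


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "articles")
        box, keys = build_maildir(root, 5)
        before = count_files(root)
        box.remove(keys[2])  # deletes exactly one file
        return before, count_files(root)
```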

Also (and maybe because I'm a DBA), this problem just *screams* for SQLite and a database in the "First Normal Form".
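
As a rough idea of what I mean, here is a minimal sketch with Python's sqlite3 module. The table and column names are invented for illustration and are not anything pan uses. Each queued article is one row keyed by message ID, so finishing a download is a single indexed DELETE instead of a rewrite of the whole queue.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Open (or create) a hypothetical download-queue database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tasks (
               message_id TEXT PRIMARY KEY,
               newsgroup  TEXT NOT NULL,
               subject    TEXT NOT NULL
           )"""
    )
    return conn


def enqueue(conn, rows):
    """Add (message_id, newsgroup, subject) tuples in one transaction."""
    with conn:
        conn.executemany("INSERT INTO tasks VALUES (?, ?, ?)", rows)


def mark_downloaded(conn, message_id):
    """Drop one finished article; returns the number of rows removed."""
    with conn:
        cur = conn.execute(
            "DELETE FROM tasks WHERE message_id = ?", (message_id,)
        )
    return cur.rowcount
```

The primary key gives SQLite an index on message_id, so the lookup does not depend on scanning the whole table.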

Scooty Puff, Sr
The Doom-Bringer
