[Pan-users] Can creating task list be optimized?


From: Duncan
Subject: [Pan-users] Can creating task list be optimized?
Date: Sun, 27 Aug 2006 00:19:48 +0000 (UTC)
User-agent: pan 0.109 (Beable)

Try this with 0.109.

Go to a group with many single-part binaries, say a picture group, a few
tens of thousands of posts.  Here, I tried one with 23,820 unread posts
according to pan, plus some downloaded/read already.

Select-all headers and try to download.

Here, pan sits for over an hour, working out and adding one post at a
time to tasks.nzb, at less than ten a second.  This is on Gentoo/amd64,
dual Opteron 242 w/ 8 gig of memory.  Pan's memory use is actually rather
modest, ~80 meg, the entire time.  There's little disk activity beyond the
periodic write of the updated tasks.nzb from cache to disk.  It pegs one
CPU at 100% (single thread, other CPU nearly idle): 80-ish percent of that
CPU is pan in user mode, 15-ish percent in system mode.

Attaching the pan process with strace -efile shows cycles of:

stat the working dir
open a tasks.nzb.tmp file for write-only
unlink tasks.nzb
rename the tasks.nzb.tmp to tasks.nzb
repeat
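
To make the traced cycle concrete, here is a minimal Python sketch of the
pattern (pan itself is not written in Python, and this is not pan's code).
It assumes the whole task file is rewritten through a .tmp file and a
rename after every added entry, and compares that with saving once after
the batch.  The function names are hypothetical.

```python
import os

def save_atomically(path, lines):
    """Write all lines to path + '.tmp', then rename it over path."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.writelines(line + "\n" for line in lines)
    # os.replace overwrites the old file in one step, so no separate
    # unlink is needed (the trace showed unlink followed by rename).
    os.replace(tmp, path)

def add_saving_each_time(path, entries):
    """Assumed current behaviour: one full rewrite per added entry."""
    queue, saves = [], 0
    for entry in entries:
        queue.append(entry)
        save_atomically(path, queue)
        saves += 1
    return saves

def add_saving_once(path, entries):
    """Alternative: queue everything in memory, then save a single time."""
    save_atomically(path, list(entries))
    return 1
```

Both functions leave the same file on disk; the difference is one save
cycle per entry versus one per batch.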

That's single-digit cycles per second: 4 with strace attached, so figure
an even 10 per second without that slow-down.  Even at 10/sec, 23,820
posts will take nearly 2400 seconds, 40 minutes, to add.
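
The estimate above is plain division; as a quick check in Python (the
function name is made up for illustration):

```python
def minutes_to_queue(posts, posts_per_second):
    """Minutes needed to queue `posts` at a steady rate."""
    return posts / posts_per_second / 60

print(round(minutes_to_queue(23820, 10), 1))  # 39.7
```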

watch ls -l tasks.nzb shows the file size slowly increasing, now at
several MB.  Repeatedly opening the tasks.nzb file in an editor
demonstrates that new content is indeed being added (roughly 10 lines an
entry), so pan isn't stuck looping on the same header.  It's just loading
them into the tasks list at less than ten a second, so it'll take an hour
to load all the headers!
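
If the file really is rewritten in full on every add (an assumption based
on the strace cycle, not on pan's source), the total output grows with
the square of the number of entries.  A rough Python sketch, using the
~10 lines per entry seen in the editor:

```python
def lines_written_total(entries, lines_per_entry=10):
    """Lines written if save k rewrites all k entries queued so far."""
    # Sum of k * lines_per_entry for k = 1..entries.
    return lines_per_entry * entries * (entries + 1) // 2

def rewrite_factor(entries):
    """How many times the final file's size gets written in total."""
    return (entries + 1) / 2
```

Under that assumption, 23,820 entries would mean writing roughly 11,900
times the size of the final file.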

I have a 6Mbit download pipe, and as I said, these are all relatively
small single-part binaries, so pan will probably download them
faster than it was able to create the task list!  (If it would ever
finish creating the list and give me a size/time estimate, I could
confirm.  I'm posting this while I wait.)  That's RIDICULOUS!

$ wc -l tasks.nzb
239118 tasks.nzb

...and still going...  Hey... it stopped and the download started!

Unfortunately... pan is apparently single-threaded on the download as
well, and with all those little posts, is /again/ pegging a single CPU,
while the other one sits idle and the download limps along at a measly
150 KByte/sec or less (that's the top of the graph), when a 6 Mbit/sec
pipe should manage ~768 KByte/sec.
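
For reference, the unit conversion behind those figures, as a small
Python sketch.  It assumes 1 Mbit = 1024 Kbit, which is the convention
that gives 768; with 1000 the ceiling comes out at 750 KByte/sec.

```python
def mbit_to_kbyte_per_sec(mbit, kilo=1024):
    """Convert a link speed in Mbit/sec to KByte/sec (8 bits per byte)."""
    return mbit * kilo / 8

def utilization(observed_kbyte_per_sec, link_mbit):
    """Fraction of the link actually in use."""
    return observed_kbyte_per_sec / mbit_to_kbyte_per_sec(link_mbit)
```

So 150 KByte/sec on a 6 Mbit/sec pipe is a bit under 20% of the link.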

nzb is xml format.  It's nice to have the nzb files, but xml is definitely
not known for its efficiency in parsing, and it shows!  BIG TIME it shows!

Looks like pan can be pretty efficient on large multipart posts,
presumably several MB per nzb entry, and it does OK on text posts as there
aren't so many of them, but get several tens of thousands of small
binaries, and it just doesn't work well at all, with the current tasks.nzb
anyway.

Additionally, pan needs to be multithreaded, at least with the UI in one
thread and everything else in another, giving the UI back some
responsiveness.  With dual CPUs, I'm not /used/ to having a non-responsive
app any more unless it's crashed, or at least unless the system is stuck
in i/o-wait, and not when the disk only runs momentarily every few seconds.
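
A minimal sketch of that split, in Python for brevity (not pan's code;
all names are hypothetical).  A worker thread builds the task list and
reports progress through a queue, while the main loop stands in for the
UI and stays free between updates.

```python
import queue
import threading

def build_tasks(headers, progress, tasks):
    """Worker thread: create one task per header, report every 1000."""
    for i, header in enumerate(headers, 1):
        tasks.append("task:" + header)
        if i % 1000 == 0:
            progress.put(i)
    progress.put(None)  # sentinel: the worker is done

def run_with_responsive_ui(headers):
    progress = queue.Queue()
    tasks, updates = [], []
    worker = threading.Thread(target=build_tasks,
                              args=(headers, progress, tasks))
    worker.start()
    while True:
        try:
            msg = progress.get(timeout=0.05)
        except queue.Empty:
            continue  # a real UI would handle clicks and redraws here
        if msg is None:
            break
        updates.append(msg)  # e.g. refresh a progress bar
    worker.join()
    return tasks, updates
```

In CPython the two threads would not run CPU-bound work in parallel, so
this only illustrates the structure: the slow loop no longer blocks the
loop that services the user.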

Me oh my... still 21,600 tasks, still averaging less than 50 KByte/sec,
with over a gig and a half still to d/l.  8-9 hours at /this/ rate.  It'll
be a while!
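
The 8-9 hour figure follows from the numbers above.  A small Python
check, assuming a steady rate and 1024-based units (the function name is
made up):

```python
def hours_remaining(remaining_mbyte, kbyte_per_sec, kilo=1024):
    """Hours left to fetch `remaining_mbyte` at a steady rate."""
    return remaining_mbyte * kilo / kbyte_per_sec / 3600

eta = hours_remaining(1.5 * 1024, 50)  # 1.5 GByte at 50 KByte/sec
```

That comes out between 8 and 9 hours; with 1000-based units it is still
a little over 8.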

Maybe I better remerge klibido, if it's going to be doing /this/.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman




