Re: [Pan-users] Re: Big XML files... (was Re: Re: Better processing of v

pan-users

From:	Ron Johnson
Subject:	Re: [Pan-users] Re: Big XML files... (was Re: Re: Better processing of very large groups?)
Date:	Sat, 04 Jul 2009 12:17:45 -0500
User-agent:	Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.19) Gecko/20090103 Thunderbird/2.0.0.19 Mnenhy/0.7.6.666

On 2009-07-04 10:37, Duncan wrote:

Ron Johnson <address@hidden> posted address@hidden, excerpted below, on Fri, 03 Jul 2009 21:56:36 -0500:
Also (and maybe because I'm a DBA), this problem just *screams* for
SQLite and a database in the "First Normal Form".
[ OK, this is a very long post, I know (tho I haven't counted the lines, 200? 250? More? I'll let pan show me that when I post and download it). But reading it and following even a few of the included tips should vastly improve your pan experience. =:^) Following all of them... well, that's up to you, but it works well for me! ]
Actually, before the C++ rewrite (the original was C coded) and the changes that allowed pan to scale to millions of headers/overviews per group from 100k, Charles' plan was, for quite some time, to eventually switch to just that, an sqlite backend.
I don't know why he didn't, except that in the 3-ish years during which pan seemed to be abandoned that we later learned he used at least part of to do the rewrite, several others (K. Haley I believe being one of them) began to experiment with pan, and some of those folks were database folks (I'm not sure if K. Haley is one of /them/). By the time Charles announced the C++ rewrite (aka new-pan, what we use now), there had actually been some preliminary numbers posted to the pan-devel list, and I think that by using some of the data management techniques that Charles /did/ use in new-pan, he actually got it to "reasonably" scale (now, it /does/ work when you throw even several million headers at it, with memory use scaling accordingly, before, 100k headers was bad, and above 200k, pan would literally sit there for days, not really increasing memory usage too badly, but just not getting anywhere -- it simply didn't scale at all above 200k headers or so, memory or no memory), and the numbers probably looked reasonably close to the preliminary database numbers as well -- at least close enough that he judged it not worth the trouble, with the clear benefit of plain text files.
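For readers wondering what the SQLite idea discussed above might look like in practice, here is a minimal sketch in Python using the standard sqlite3 module. It is not pan's actual storage format or anything Charles prototyped: the table name, column names, and sample rows are all hypothetical, chosen only to show one atomic value per column and one row per article, which is roughly what "First Normal Form" asks for.

```python
import sqlite3


def build_header_db(conn):
    """Create a hypothetical 1NF header table: one row per article."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS headers (
            group_name     TEXT    NOT NULL,
            article_number INTEGER NOT NULL,
            message_id     TEXT    NOT NULL,
            subject        TEXT    NOT NULL,
            author         TEXT    NOT NULL,
            byte_count     INTEGER NOT NULL,
            PRIMARY KEY (group_name, article_number)
        )
        """
    )
    # Index so that per-group subject lookups need not scan every row.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_headers_subject "
        "ON headers (group_name, subject)"
    )


def add_headers(conn, rows):
    """Insert many header rows inside a single transaction."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO headers VALUES (?, ?, ?, ?, ?, ?)", rows
        )


def subjects_matching(conn, group_name, pattern):
    """Return subjects in one group matching a SQL LIKE pattern."""
    cursor = conn.execute(
        "SELECT subject FROM headers "
        "WHERE group_name = ? AND subject LIKE ? "
        "ORDER BY article_number",
        (group_name, pattern),
    )
    return [row[0] for row in cursor]


# Made-up sample data, kept in memory so nothing is written to disk.
conn = sqlite3.connect(":memory:")
build_header_db(conn)
add_headers(
    conn,
    [
        ("alt.example.group", 1, "<a1@example.invalid>", "holiday photos (1/2)", "alice", 1000),
        ("alt.example.group", 2, "<a2@example.invalid>", "holiday photos (2/2)", "alice", 900),
        ("alt.example.group", 3, "<a3@example.invalid>", "unrelated post", "bob", 200),
    ],
)
matches = subjects_matching(conn, "alt.example.group", "holiday%")
```

The point of the design is that sorting and filtering are handed to the database and its indexes instead of being done over an in-memory structure built from text files. Whether that would actually beat new-pan's approach is exactly the open question raised above.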
But, meanwhile, for those dealing with those huge groups, there's some usage patterns that work rather better than others, and thus some usage patterns that users should avoid in the large groups, if they want a reasonably working pan.
# 1 most important, particularly since pan is a GNOME family app and as many Ubuntu users can attest, PAN AND THE GNOME ASSISTIVE TECHNOLOGIES APPLET DO NOT GET ALONG WELL AT ALL!!! When that applet is running, it apparently polls /something/ often enough to keep pan from making efficient progress at header sorting, in particular. What might otherwise take 30 seconds or maybe two minutes (still long enough), ends up taking half an hour... two hours... more... So if you're running that, do yourself a favor and at LEAST shut it off when running pan. Either that, or switch to something other than pan, as the two simply don't get along. For more details, see the list archives.

How do I tell if the GAT applet is running? (Using Debian, I don't *think* it is because I don't see it in the Tray, but want to be sure.)


[snip]

Still, while that's the way that works best for me, it's obviously not everyone's style, or pan would default to downloading to cache, instead of the download and save default it currently has. But that's why I listed these three tips separately and marked them as distinctly optional. It does work well, but it's not for everybody. Meanwhile, if people just use tips 1-9, or even just 1 and 3 mainly, it'll likely improve their experience dramatically, even if they don't choose to do the whole separate pan instances, huge cache, download-to-cache, then go thru and save, thing.

Let's say I increase the cache, and then download to cache. How then do I "save *from* cache", converting from "yenc" to binary?

Also, would increasing the cache (and then politely restarting pan) profit me any if I already have a large number of articles in the "save queue"?


--
Scooty Puff, Sr
The Doom-Bringer
