Ron Johnson <address@hidden> posted
address@hidden, excerpted below, on Fri, 03 Jul 2009 21:56:36
-0500:
> Also (and maybe because I'm a DBA), this problem just *screams* for
> SQLite and a database in the "First Normal Form".
[ OK, this is a very long post, I know (tho I haven't counted the lines,
200? 250? More? I'll let pan show me that when I post and download it).
But reading it and following even a few of the included tips should
vastly improve your pan experience. =:^) Following all of them... well,
that's up to you, but it works well for me! ]
Actually, before the C++ rewrite (the original was written in C) and the
changes that let pan scale from 100k headers/overviews per group to
millions, Charles' plan was, for quite some time, to eventually switch
to just that, an sqlite backend.
I don't know why he didn't. What I do know is that pan seemed to be
abandoned for 3-ish years, at least part of which, we later learned,
Charles spent on the rewrite. During that time several others (K. Haley
I believe being one of them) began to experiment with pan, and some of
those folks were database folks (I'm not sure if K. Haley is one of
/them/). By the time Charles announced the C++ rewrite (aka new-pan,
what we use now), some preliminary numbers had actually been posted to
the pan-devel list.
I think that by using some of the data management techniques that
Charles /did/ use in new-pan, he got it to scale "reasonably". Now it
/does/ work when you throw even several million headers at it, with
memory use scaling accordingly. Before, 100k headers was bad, and above
200k pan would literally sit there for days, not really increasing
memory usage too badly, but just not getting anywhere. It simply didn't
scale at all above 200k headers or so, memory or no memory. The new-pan
numbers probably looked reasonably close to the preliminary database
numbers as well, at least close enough that he judged a database backend
not worth the trouble, given the clear benefit of plain text files.
But, meanwhile, for those dealing with those huge groups, there are some
usage patterns that work rather better than others, and thus some that
users should avoid in the large groups if they want a reasonably
working pan.
#1, and most important, particularly since pan is a GNOME family app: as
many Ubuntu users can attest, PAN AND THE GNOME ASSISTIVE TECHNOLOGIES
APPLET DO NOT GET ALONG WELL AT ALL!!! When that applet is running, it
apparently polls /something/ often enough to keep pan from making
efficient progress, at header sorting in particular. What might
otherwise take 30 seconds or maybe two minutes (still long enough) ends
up taking half an hour... two hours... more... So if you're running
that, do yourself a favor and at LEAST shut it off when running pan.
Either that, or switch to something other than pan, as the two simply
don't get along. For more details, see the list archives.