Re: [Pan-devel] Re: ancient DB schema

From: Calin A. Culianu
Subject: Re: [Pan-devel] Re: ancient DB schema
Date: Tue, 8 Jun 2004 09:51:47 -0400 (EDT)

These are all great ideas... but as I said in a previous post -- I am
going to just take a stab at making article header loading scale better
to large numbers of headers (my goal is for even 2 million headers to work
as well as 10,000). 

Anyway, you touched upon worries that it's bad for the db schema to have 
to constantly change.  You kind of alluded to the idea that it's good to 
finalize a data design now rather than later.  I beg to differ.  I think 
in this particular application, we have a lot of control over the user's 
database, and the data is extremely transient (namely, no one is going to 
care about articles that are 2 months old as they will have likely expired 
anyway from the servers).  Given these factors, it is not such a 
show-stopper if we have to constantly change the db schema to support new 
features or a new design.. that is all stuff under our control, after 
all.. and the worst case scenario is that a user has to throw out his/her 
downloaded article headers and re-download.  In my opinion that isn't so 
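
To make the "throw out and re-download" worst case concrete, here is a
minimal sketch, not Pan's actual code. It assumes the header cache lives in
an SQLite file and uses SQLite's user_version pragma as the schema stamp.
The table layout, the function name open_header_cache, and the version
numbers are all hypothetical.

```python
import sqlite3

SCHEMA_VERSION = 3  # hypothetical; bump whenever the table layout changes


def open_header_cache(path, version=SCHEMA_VERSION):
    """Open the header cache, discarding it if its schema stamp differs."""
    db = sqlite3.connect(path)
    (found,) = db.execute("PRAGMA user_version").fetchone()
    if found != version:
        # Headers are transient, so drop and rebuild instead of migrating.
        db.execute("DROP TABLE IF EXISTS header")
        db.execute(
            "CREATE TABLE header ("
            " message_id TEXT PRIMARY KEY,"
            " subject TEXT NOT NULL,"
            " is_read INTEGER NOT NULL DEFAULT 0)"
        )
        # PRAGMA values cannot be bound as parameters; version is an int.
        db.execute(f"PRAGMA user_version = {int(version)}")
        db.commit()
    return db
```

With something like this, a schema change costs the user one fresh header
download and nothing else, which is the trade-off argued for above.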


On Fri, 4 Jun 2004, Duncan wrote:

> K. Haley posted <address@hidden>, excerpted below,
>  on Fri, 04 Jun 2004 18:56:11 -0600:
> > A newsgroup.
> > (it looks to me like pan treats folders the same as news groups. `folder`
> > is not needed if separate FOLDER and FOLDER_ARTICLE tables are used.)
> The "looks like" statement is correct. Charles has stated that it's
> deliberate that folders are treated exactly the same as groups,
> simplifying the logic as there's only one set to deal with, not a separate
> set for folders and for groups.
> One gotcha is tracking read and unread articles.  PAN apparently uses the
> server added xref line for that purpose.  Since that's a server-local
> added header, a few server implementations don't add it, and folks have
> from time to time complained that messages on these servers don't get
> marked as read, properly.
> Additionally, at one point as a performance-enhancing experiment, PAN
> tried not actually retrieving user-posted messages, but substituting the
> local copy instead.  This failed with the same mark-as-read problem,
> because the local copy didn't have the server supplied xref header. 
> Additionally, it was I believe decided that it was more important to see
> what actually posted, in case it got mangled or something, than the bit of
> speed saved by not downloading those articles (particularly since PAN
> doesn't do attachments so the articles aren't that long anyway).  I doubt
> this will be tried again, but it's one example of the practical effect of
> the tracking method.
> Thus, however read/unread is tracked, one might wish to keep it in mind
> when building the database record format, in case it changes at some
> point.  Ideally it would be tracked by msg-id, making the tracking a bit
> more server independent.
> Another thing to consider, however, is that some servers (my own ISP's
> included, as I see the complaints on the internal groups..  They are
> running High Winds serverware) have bugs when retrieving by msg-id only,
> so while it would be nice for PAN to have that functionality as an option,
> it should not be the ONLY way PAN can retrieve.  Retrieving by xref
> number, which is what I /believe/ PAN does now, should always be
> supported, so it can't be eliminated entirely.
> Back to folders.  It might be useful to have the ability to add old
> folders back in, at some point.  I wouldn't worry about this functionality
> now, except that we shouldn't break the eventual possibility by limiting
> the database record format.  Basically what I'm saying, is that at some
> point it would be nice to have a folder import function, that would either
> read the messages and build the individual fields from them as possible,
> or substitute reasonable defaults where the data cannot be gathered,
> including if the message doesn't contain a "required" header.  We
> shouldn't artificially limit the future ability to add this functionality,
> with our formatting choices now.  (I could give practical examples if
> needed but won't here, for brevity.)
> Just some random thoughts.. inspired by that "looks like a group"
> notation.  I'm not a db person so know not whether they pertain or not,
> but would hate to find out they did only AFTER the fact.  <g>
> -- 
> Duncan - List replies preferred.   No HTML msgs.
> "They that can give up essential liberty to obtain a little
> temporary safety, deserve neither liberty nor safety." --
> Benjamin Franklin
> _______________________________________________
> Pan-devel mailing list
> address@hidden
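
On the suggestion above to track read/unread by msg-id while still being
able to retrieve by article number: a rough sketch of how the two could be
kept apart, again assuming SQLite. None of these table or function names
come from Pan; they are made up for illustration.

```python
import sqlite3


def create_tables(db):
    """Keep read state server-independent; keep numbers per server."""
    db.executescript(
        """
        CREATE TABLE read_state (
            message_id TEXT PRIMARY KEY
        );
        CREATE TABLE server_article (
            server         TEXT    NOT NULL,
            newsgroup      TEXT    NOT NULL,
            article_number INTEGER NOT NULL,
            message_id     TEXT    NOT NULL,
            PRIMARY KEY (server, newsgroup, article_number)
        );
        """
    )


def mark_read(db, message_id):
    # Keyed only on message_id, so no Xref header is needed.
    db.execute("INSERT OR IGNORE INTO read_state VALUES (?)", (message_id,))


def is_read(db, message_id):
    row = db.execute(
        "SELECT 1 FROM read_state WHERE message_id = ?", (message_id,)
    ).fetchone()
    return row is not None


def article_number(db, server, newsgroup, message_id):
    """Look up the per-server number for retrieval; None if unknown."""
    row = db.execute(
        "SELECT article_number FROM server_article"
        " WHERE server = ? AND newsgroup = ? AND message_id = ?",
        (server, newsgroup, message_id),
    ).fetchone()
    return row[0] if row else None
```

The read flag lives in one place keyed by msg-id, so a local copy of a
posted article can be marked read without a server-added header, while
retrieval by number stays available for servers that mishandle msg-id
requests.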
