


From: CSV4ME2
Subject: Re: [Pan-users] Re: Big XML files... (was Re: Re: Better processing of very large groups?)
Date: Sun, 5 Jul 2009 15:51:22 +0200
User-agent: KMail/1.9.10

On Sunday 05 July 2009, Steven D'Aprano wrote:
> On Sun, 5 Jul 2009 11:39:48 am Ron Johnson wrote:
> > On 2009-07-04 17:21, CSV4ME2 wrote:
> > > On Saturday 04 July 2009, Ron Johnson wrote:
> > >> On 2009-07-04 13:57, Matej Cepl wrote:
> >
> > [snip]
> >
> > >>>                             I don't trust any email client which
> > >>> saves anything into SQLite ;-)
> > >>
> > >> SQLite is "just" the obvious choice.  What happened to c-trieve,
> > >> or any of the other b+tree libraries?
> > >
> > > No it isn't:
> > > - nothing beats processing dedicated in-core data structures wrt to
> > > speed
> >
> > Your CompSci professor wants to back, to fail you in Data Structures
> > class...
>
> Ha! You fail! *wink*
>
> CSV4ME2 didn't say anything about making "a linear search thru [sic] a
> large in-memory array". Read what he said more carefully:
>
> "nothing beats processing dedicated in-core data structures wrt to
> speed".
>
> No mention of linear searching. Hash tables get O(1) searches, binary
> trees get O(log N), as do binary searches through an array. And if
> they're in memory, you don't have to wait for disk IO which is two
> orders of magnitude slower than memory IO.
>
> > A linear search thru a large in-memory array is *much* slower than
> > an indexed search of an ODS (on-disk structure, like a b-tree or an
> > inverted list).  Especially if the OS has buffered that ODS into
> > core.
>
> If the entire ODS can fit in memory, and you don't need persistence,
> then why bother writing it to disk?
>
> Of course, if you do need persistence, that's a good reason. But if you
> don't need ACID compliance, why pay the overhead of ACID compliance?
> Just serialise the data structure to disk as needed, keeping the old
> one behind as backup.
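
A minimal sketch of the "serialise as needed, keep the old one as backup"
idea quoted above, in Python. The names save_with_backup and load, the
".bak" suffix, and the use of pickle are assumptions made for illustration;
nothing here is taken from Pan or from any other mail/news client.

```python
import os
import pickle
import tempfile


def save_with_backup(obj, path):
    """Serialise obj to path, keeping the previous file as path + '.bak'.

    Hypothetical helper: the data is written to a temporary file first,
    so a crash in mid-write never damages the existing copy.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            pickle.dump(obj, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        if os.path.exists(path):
            os.replace(path, path + ".bak")  # old copy becomes the backup
        os.replace(tmp_path, path)  # new copy takes its place
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load(path):
    """Read back whatever save_with_backup stored at path."""
    with open(path, "rb") as f:
        return pickle.load(f)
```

This gives no transactions and no concurrent access, only a previous
generation to fall back on, which is the trade-off the quoted text describes.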

Steven,

Yes, my statement was all about access speeds; it said nothing about (not so)
cleverly devised algorithm/data structure combinations.
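
For illustration, here is a rough Python sketch of the three in-memory
lookups mentioned in the quoted text: a linear scan, a binary search over a
sorted array, and a hash table. The article_ids data and the function names
are made up for the example.

```python
import bisect


def linear_search(items, target):
    """O(N): look at every element until the target turns up."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1


def binary_search(sorted_items, target):
    """O(log N): halve the search range each step (needs sorted input)."""
    i = bisect.bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1


# Hypothetical data: 100,000 sorted, even-numbered article ids.
article_ids = list(range(0, 200_000, 2))

# Hash table: O(1) average-case lookup from article id to position.
index_by_id = {aid: pos for pos, aid in enumerate(article_ids)}
```

All three return the same position for an id that exists; they differ only
in how much of the structure has to be touched to find it.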

Thanks for the passing grade :-)

C



