Re: [Pan-users] Re: Big XML files... (was Re: Re: Better processing of v
From: Ron Johnson
Subject: Re: [Pan-users] Re: Big XML files... (was Re: Re: Better processing of very large groups?)
Date: Sat, 04 Jul 2009 22:50:38 -0500
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.8.1.19) Gecko/20090103 Thunderbird/2.0.0.19 Mnenhy/0.7.6.666
On 2009-07-04 21:56, Steven D'Aprano wrote:
On Sun, 5 Jul 2009 11:39:48 am Ron Johnson wrote:
On 2009-07-04 17:21, CSV4ME2 wrote:
On Saturday 04 July 2009, Ron Johnson wrote:
On 2009-07-04 13:57, Matej Cepl wrote:
[snip]
I don't trust any email client which
saves anything into SQLite ;-)
SQLite is "just" the obvious choice. What happened to c-trieve,
or any of the other b+tree libraries?
No it isn't:
- nothing beats processing dedicated in-core data structures wrt to
speed
Your CompSci professor wants you back, to fail you in Data Structures
class...
Ha! You fail! *wink*
CSV4ME2 didn't say anything about making "a linear search thru [sic] a
large in-memory array". Read what he said more carefully:
"nothing beats processing dedicated in-core data structures wrt to
speed".
I know...
No mention of linear searching. Hash tables get O(1) searches, binary
trees get O(log N), as do binary searches through an array. And if
they're in memory, you don't have to wait for disk IO which is two
orders of magnitude slower than memory IO.
Maybe it's the pedant in me, but he made no mention of the type of
algorithm, so to make a point, I chose an example demonstrating that
*in and of itself*, putting the data structures in memory does not
*guarantee* good performance.
"All else being equal", though, yes it does.
A linear search thru a large in-memory array is *much* slower than
an indexed search of an ODS (on-disk structure, like a b-tree or an
inverted list). Especially if the OS has buffered that ODS into
core.
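To put rough numbers on the linear-versus-indexed point, here is a minimal sketch that counts how many elements each approach has to examine. The list of article numbers and the function names are made up for illustration; nothing here is taken from Pan's code, and it measures lookups only, not disk or cache effects.

```python
def linear_search(items, target):
    """Scan front to back; return (index, number of elements examined)."""
    examined = 0
    for i, value in enumerate(items):
        examined += 1
        if value == target:
            return i, examined
    return -1, examined


def binary_search(sorted_items, target):
    """Halve the candidate range each step; return (index, number of probes)."""
    lo, hi = 0, len(sorted_items)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_items) and sorted_items[lo] == target
    return (lo if found else -1), probes


# Hypothetical data: one million sorted article numbers.
article_ids = list(range(1_000_000))
target = 999_999  # last element, the worst case for the linear scan

linear_index, linear_steps = linear_search(article_ids, target)
binary_index, binary_steps = binary_search(article_ids, target)

print("linear scan examined", linear_steps, "elements")
print("binary search probed", binary_steps, "elements")
```

The linear scan touches every element, while the ordered lookup needs about twenty probes for a million entries. A b-tree on disk does the same kind of narrowing, just with a page per level instead of an array element per probe.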
If the entire ODS can fit in memory, and you don't need persistence,
then why bother writing it to disk?
Because you are *never* guaranteed that your data structures will
fit in RAM, and that the user will have lots of RAM and a multi-core
CPU.
Using a well-indexed structure means that the app doesn't have to
continually copy/rename/delete tasks.nzb. Performance will be
maintained because the OS will buffer most of the ODS, so you'll
only be writing back dirty pages instead of serializing the whole
tasks.nzb.
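As a rough illustration of updating one record in place instead of re-serializing everything, here is a sketch using Python's built-in sqlite3 module. The `tasks` table, its columns, and the `mark_done` helper are hypothetical and are not Pan's actual schema; the database lives in memory here only so the example is self-contained, and a real application would pass a file path to `connect()`.

```python
import sqlite3

# ":memory:" keeps the sketch self-contained; use a file path for persistence.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    " id INTEGER PRIMARY KEY,"   # the primary key is the index used for lookups
    " subject TEXT NOT NULL,"
    " done INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO tasks (id, subject) VALUES (?, ?)",
    [(i, f"article {i}") for i in range(10_000)],
)
conn.commit()


def mark_done(connection, task_id):
    """Flag a single task as finished; only that row is modified."""
    with connection:  # commits on success, rolls back on an exception
        cursor = connection.execute(
            "UPDATE tasks SET done = 1 WHERE id = ?", (task_id,)
        )
    return cursor.rowcount


updated = mark_done(conn, 4242)
done_count = conn.execute(
    "SELECT COUNT(*) FROM tasks WHERE done = 1"
).fetchone()[0]
print(updated, "row updated;", done_count, "task(s) done in total")
```

With a file-backed database, a change like this rewrites the affected pages rather than all ten thousand records, which is the contrast with rewriting the whole of tasks.nzb on every save.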
THIS FACT IS THE genesis of this whole long thread: currently on my
system, pan is copy/rename/deleting a 370MB tasks.nzb every 90 seconds.
Of course, if you do need persistence, that's a good reason. But if you
don't need ACID compliance, why pay the overhead of ACID compliance?
Just serialise the data structure to disk as needed, keeping the old
one behind as backup.
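A minimal sketch of the "serialise and keep the old one as backup" approach, assuming JSON as the format and a `.bak` suffix for the previous copy. Both choices, along with the `save_with_backup` name and the `tasks.json` file, are illustrative and are not how Pan writes tasks.nzb.

```python
import json
import os
import tempfile


def save_with_backup(data, path):
    """Write data to path, keeping the previous version as path + '.bak'."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        if os.path.exists(path):
            os.replace(path, path + ".bak")  # old copy becomes the backup
        os.replace(tmp_path, path)           # new copy takes its place
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Usage with a throwaway directory and made-up task lists.
workdir = tempfile.mkdtemp()
tasks_path = os.path.join(workdir, "tasks.json")

save_with_backup({"queued": ["article 1"]}, tasks_path)
save_with_backup({"queued": ["article 1", "article 2"]}, tasks_path)

with open(tasks_path, encoding="utf-8") as f:
    current = json.load(f)
with open(tasks_path + ".bak", encoding="utf-8") as f:
    backup = json.load(f)
print("current:", current)
print("backup: ", backup)
```

Note that every save still writes the entire structure, so the cost grows with the size of the file, and there is a brief window between the two renames in which only the backup exists under a final name.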
As you can see from this attachment, Pan is using 2.5GB RAM, 32% of
core. If I had a more typical 4GB, 2GB or even 1GB, Pan
would be thrashing my system to death.
(Are you people reading this through Pan seeing my attachments?)
--
Scooty Puff, Sr
The Doom-Bringer
top - 22:32:45 up 16 days, 10:23, 1 user, load average: 1.82, 1.90, 1.93
Tasks: 189 total, 3 running, 186 sleeping, 0 stopped, 0 zombie
Cpu(s): 84.2%us, 14.9%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.5%hi, 0.5%si, 0.0%st
Mem: 8177796k total, 8125156k used, 52640k free, 278360k buffers
Swap: 0k total, 0k used, 0k free, 3974552k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15760 me 20 0 2610m 2.5g 5864 R 89 31.9 2447:41 pan
28872 me 20 0 687m 321m 14m S 12 4.0 1086:44 firefox-bin
27533 root 20 0 423m 269m 7432 S 0 3.4 111:00.45 Xorg
28060 me 20 0 397m 217m 18m S 1 2.7 18:03.99 icedove-bin
10929 root 20 0 47648 31m 1260 S 0 0.4 2:55.08 console-kit-dae
29007 me 20 0 70532 31m 1308 S 0 0.4 51:34.65 hellanzb
2475 root 20 0 32200 27m 2376 S 0 0.3 0:04.28 spamd
6210 root 20 0 32080 27m 2236 S 0 0.3 0:06.58 spamd
5556 me 20 0 55380 26m 4284 S 0 0.3 4:26.18 gqview
4506 root 20 0 29568 23m 1108 S 0 0.3 0:41.83 spamd
30608 me 20 0 39864 22m 1716 S 0 0.3 0:13.15 urxvt
27649 me 20 0 83516 12m 5976 S 0 0.2 14:45.63 gnome-panel
27650 me 20 0 82056 9828 4940 S 0 0.1 0:24.91 nautilus
27696 root 20 0 12504 9140 716 S 0 0.1 0:00.23 SystemToolsBack
27656 me 20 0 28172 8216 2492 S 0 0.1 0:00.30 system-config-p
13242 me 20 0 36984 7524 988 S 0 0.1 0:02.01 urxvt
27689 me 20 0 56048 7500 4460 S 0 0.1 0:11.77 gweather-applet