From: SciFi
Subject: [Pan-devel] Pan hung during newsgroup processing (Re: My list of present problems with Pan.)
Date: Sat, 22 Oct 2011 18:11:05 +0000 (UTC)
User-agent: Pan/0.135 (Tomorrow I'll Wake Up and Scald Myself with Tea; GIT d8bfcda (github.com/judgefudge/pan2/master); x86_64-apple-darwin10.8.0; gcc-4.2.1 (build 5666 (dot 3)); 32-bit mode)


Hi,

On Tue, 18 Oct 2011 03:53:23 +0000, Duncan wrote:
> 
> SciFi posted on Mon, 17 Oct 2011 20:13:48 +0000 as excerpted:
> […]
>> 5.  Before accepting a "version 1.0" of Pan, I have written requests
>> over the years to have more "multi-threading" incorporated into the
>> program.
> 
> What's more distressing for me, here, is that particularly when one 
> decides to cache decades worth of pan list history, etc, on cold-system-
> cache as when first opening pan, it'll freeze not only itself, but the 
> whole of X, for noticeable periods!  I can move the mouse, and anything 
> already in memory seems to work, but all four spindle lights are on, on 
> the RAID-1, indicating the system is accessing four separate things in 
> parallel, and nothing responds until whatever it is is finished (still 
> some time before pan actually shows up, but the RAID indicators go back 
> to their more normal pattern, 1-4 blinking, not all four on solid).
> 
> AFAIK that's a kernel and hardware platform I/O thing to some degree, not 
> entirely pan's fault by any means (in reality, the hardware and kernel 
> shouldn't even allow that sort of hogging), but I believe it should be 
> possible to program pan not to hit the system that hard.  I don't believe 
> I've seen anything else do that since I went dual-dual-cores and 4-
> spindle md/RAID-1.
> 
>> a)  For example, when I open a newsgroup known to be huge with binary
>> files, everything is IMMEDIATELY STOPPED from going forth, _including_
>> the download decoding queue which I thought was already a separate
>> thread -- I can see such stoppage by visual observation of my modem's
>> lights: they stop blinking while a newsgroup is being opened.  When the
>> newsgroup is finally opened, the other functions continue from where
>> it/they were stopped.
> 
> AFAIK, my experience above may shed some light on that.  Is the rest of 
> the system interactive?  In particular, can you start a new app or open a 
> new file that hasn't been accessed yet this boot (or since you did drop-
> caches or the equivalent on OSX), so it's not in system-cache yet?  
> 
> If not, then it's I/O starvation of the entire system, not simply pan-
> threading, altho as I said, it should be possible to code pan not to hit 
> the system so hard, tho this is rather far from a solved issue in system 
> I/O circles, for sure.
> 
> It's worth mentioning, BTW, that such lockups are rather more common on 
> single-disk/single-core systems, and at least Linux itself has gotten 
> better about it over the years.  It's significantly harder to do on a 
> quad-disk Linux md/RAID-1 running on quad cores, than on a single-core 
> single-disk system.  (It's also worth noting that this is a benefit of 
> kernel RAID-1 as well, for read-only and read-mostly loads at least, as 
> opposed even to RAID-0.  I was quite surprised at how well the kernel 
> actually does in parallelizing access on the RAID-1 as opposed to RAID-6 
> and even RAID-0.  System read-contention went down DRAMATICALLY, with a 
> corresponding increase in responsiveness and speed.  But part of that was 
> very likely due to the defrag effect of copying the existing system over 
> to the new layout.  Honestly, I haven't done a mkfs and recopy from 
> backup on the appropriate md/RAID since I downloaded the full gmane 
> archive for several lists, including pan.user, and I really should, and 
> see how much difference /that/ makes, before I complain too loudly.)
> 
> The point is, it's not necessarily entirely pan, at least from the data 
> you've presented.  Part of it is likely disk/hardware I/O and kernel I/O 
> scheduling issues.
> 
> Meanwhile, it's also worth noting that the rewrite to make pan disk-based 
> rather than memory based would very likely solve this problem as part of 
> the package.  That came up on -user again, recently, connected as usual 
> with the switch-to-sqlite-for-the-backend proposal.
> 
> Thus, in reality, we're back where we were before the C++ rewrite.  The 
> efficiencies of the new format did indeed buy pan some scalability in 
> terms of memory, and a few years of calendar time, but we're back with 
> the same problem, and the same proposed sqlite-database based disk-based 
> solution, as opposed to the huge-in-memory tree that pan has used to this 
> point, that we were discussing back before the rewrite.
> 
> I guess the good thing is that sqlite is rather better and more robust 
> now, what with firefox's sqlite adoption and usage in the mean time, and 
> the experience with the pan rewrite probably means pan will avoid certain 
> implementation mistakes, as well.
> 
> But it still needs done...

Oh boy, so much to talk about here…

Remember that OSX is based on FreeBSD and the Mach kernel.
It is not the age-old Mac system, but a modernized *ix system as a base.

Firstly, I definitely know the temporary hang affects _that_ task _only_.
I happen to run 3 Pan tasks under X11 (XQuartz) plus the other subtasks
that XQuartz launches (quartz-wm, a copy/paste subtask, etc.).  When one
is loading/saving/whatever a big newsgroup header-list and gets hung, all
other X11 tasks are functioning adequately plus all other OSX tasks are
still running (e.g. EyeTV is still recording my TV shows, I can still
browse websites, type these notes, etc.).

Secondly, even tho my iMac has the Core2Duo chip, the hang does not
depend on which 'core' the task is 'on'.

Thirdly, I use several FireWire800 disk drives, one of which has my
PAN_HOME directories.  During a Pan hang, that disk is still able to
process other info from/to other tasks.  Also the FW channel is not hung,
thus all other disk drives are able to function adequately.
(As an aside, OSX's implementation of FW still has an old bug, where a
bus problem could hang the entire system.  I'm still seeing rare
occurrences of it.  But do know that I can tell such hangs apart
from a single Pan-task hang.  ;)  )

Fourthly, I also said the Pan-downloading thread is stuck as well, when
this happens in the same task.  I thought the downloading queue was a
separate thread -- multi-threading being allowed inside this rewritten
Pan version "this far" (I know some of the history behind this).  My
setup includes using a RamDisk (TMPDIR) which the Pan-uudeview code uses
as an interim processing area, and putting the decoded files on the main
OSX disk (not external disk).  What I'm driving at, here, is that none of
the devices are being "hogged" by the hung Pan task -- it is strictly the
design of Pan's newsgroup processing that is clogging things up -- and I
am wondering how GTK treats multi-threading; it doesn't seem to do much
to let the base o.s. multi-threading through for full effect.
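
(A minimal sketch of the kind of separation being asked for, in plain
C++11 with std::thread.  Nothing here is taken from Pan's source:
load_headers, open_group_in_background and the step counter are
hypothetical names, and the counter merely stands in for the download
queue continuing to run while a big group is being opened.)

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the slow part of opening a big newsgroup.
// It builds the header list privately and touches no shared state.
std::vector<std::string> load_headers(std::size_t count) {
    std::vector<std::string> headers;
    headers.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        headers.push_back("article " + std::to_string(i));
    }
    return headers;
}

// Runs the load on a worker thread while the calling thread keeps
// servicing other work.  Returns how many "other work" steps ran
// while the load was in flight; the loaded headers land in `out`.
std::size_t open_group_in_background(std::size_t count,
                                     std::vector<std::string>& out) {
    std::atomic<bool> done{false};
    std::vector<std::string> result;

    std::thread worker([&] {
        result = load_headers(count);
        done.store(true, std::memory_order_release);
    });

    std::size_t other_work_steps = 0;
    do {
        ++other_work_steps;  // e.g. process one download-queue item
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    } while (!done.load(std::memory_order_acquire));

    worker.join();           // result is safe to read after join
    out = std::move(result);
    return other_work_steps;
}
```

(In a GTK program the polling loop above would be the GTK main loop
itself, and the worker would typically hand its result back with
something like g_idle_add rather than a flag; that part is an assumption
about how it could be done, not a description of what Pan does.)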

Fifthly, I have only a bit of SQL experience, coming from testing MythTV
(it uses MySQL for a scheduling database among its other functions).
Eventually I would not want to run multiple versions of SQL for whatever
reason -- SQL was supposed to be a "standard", with us being able to pick
an app that would "run" it, and any app should be compatible with another
app as long as they all support the "standard".  But I am seeing this
more & more as a disaster -- I cannot run sqlite in place of mysql and
still keep MythTV happy, for example.

(FWIW much of all this applies to whatever platform I end up building to
replace this fruit.  ;)  )

I was sure my (old) writeups on this problem had adequately described
these ramifications.  ;)  For brevity I'll leave my babbling here.  :)





