Re: [Gnash-dev] IO performance

From: Richard Wilbur
Subject: Re: [Gnash-dev] IO performance
Date: Fri, 05 Jun 2009 16:25:20 -0600

On Mon, 2009-06-01 at 23:40 +0200, strk wrote:
> Profiling of Gnash by Andrea on Amiga OS brought the performance
> of Gnash's IO operations back to my attention.
> He found that the 100% CPU usage there is due
> to the far too many small file reads Gnash performs.
> This is a known issue I tried to deal with a few times
> in the past without committing any big improvement yet,
> but I'm sure there's a lot to do.
Is there enough to keep more than one person busy?  I love optimizing
things and am more than happy to get my hands dirty in the code, but I
haven't jumped too deep into this codebase yet.

In any case, my trunk PPC build of gnash is painfully slow on some
animations, so I would be willing to test.

> The first problem is the *need* we have for small reads
> from IOChannel (formerly known as tu_file) from the Gnash
> parsers (SWF and FLV). The need arises from the parser model
> which is a pull parser, fetching even bits by bits for the SWF
> case.
> A first approach at reducing the problem was the caching gnash::stream
> patch which you can find here: 
> That patch probably doesn't apply to trunk anymore, but the idea
> was that at least the SWF parser would fetch full tags and keep
> them in memory.
Are you interested in having someone update the above-mentioned patch
so it applies to trunk, or do you think there is now a better way to
implement this?
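For the SWF case, fetching a whole tag into memory could look roughly
like the sketch below. It reads against plain stdio rather than Gnash's
IOChannel, and `TagBuffer`/`readTag` are hypothetical names; the header
layout is the standard SWF tag record header (16 bits of code+length,
with 0x3F signalling a 32-bit extended length).

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical sketch: pull one whole SWF tag into a buffer so the
// parser can then read bits from memory instead of hitting the file.
struct TagBuffer {
    uint16_t code;
    std::vector<uint8_t> payload;
};

bool readTag(std::FILE* in, TagBuffer& tag)
{
    uint8_t hdr[2];
    if (std::fread(hdr, 1, 2, in) != 2) return false;
    uint16_t raw = hdr[0] | (hdr[1] << 8);   // SWF is little-endian
    tag.code = raw >> 6;
    uint32_t len = raw & 0x3F;
    if (len == 0x3F) {                       // 32-bit extended length
        uint8_t ext[4];
        if (std::fread(ext, 1, 4, in) != 4) return false;
        len = ext[0] | (ext[1] << 8) | (ext[2] << 16)
            | (uint32_t(ext[3]) << 24);
    }
    tag.payload.resize(len);
    return len == 0
        || std::fread(tag.payload.data(), 1, len, in) == len;
}
```

With something like this, the bit-level reads the parser needs would
all land on `tag.payload` in memory, and the file sees one read per
tag instead of many tiny ones.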

> Another spot we may optimize are the actual ::fread calls
> in the curl and filedescriptor adapters. They currently read
> in chunks of 1 byte, but could easily be switched to read
> in bigger chunks, namely as many as requested by caller.
> This might not make a big difference until the callers actually
> ask for bigger chunks, but it's a start.
I have worked on this kind of problem on some embedded systems, and,
properly applied, the solution you propose above made a huge difference
in efficiency.  The overhead of going through the I/O and file-system
stack to fetch small pieces of information is very large.
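The adapter change amounts to passing the caller's requested size
straight through to fread instead of looping one byte at a time. A
minimal sketch (the function name is illustrative, not the actual
curl/filedescriptor adapter code), with a loop to cope with short
reads:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical sketch: read as many bytes as the caller asked for in
// as few fread calls as possible, instead of one fread per byte.
size_t readBytes(std::FILE* in, uint8_t* dst, size_t wanted)
{
    size_t got = 0;
    while (got < wanted) {
        size_t n = std::fread(dst + got, 1, wanted - got, in);
        if (n == 0) break;   // EOF or error: return what we have
        got += n;
    }
    return got;
}
```

As noted, this only pays off once callers actually request larger
chunks, but it removes the per-byte call overhead on the spot.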

> Yet another approach would be using mmap and let the system
> do its best to make reads faster.
> I guess the main problem here would be profiling as gprof won't
> notice the amount of time spent there, similarly to how allocation
> costs went unnoticed in the ninja case.
Sounds like FPS and the time spent in stream-input calls would be
useful measurements.
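An mmap-backed channel would turn each small read into a plain memcpy
from the mapping, with the kernel's page cache doing the buffering.
A rough sketch assuming POSIX mmap; the class name and interface are
hypothetical, not Gnash's IOChannel:

```cpp
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdint>
#include <cstring>

// Hypothetical sketch: map the whole file once, then serve reads as
// memory copies with no syscall per read.
class MappedFile {
public:
    explicit MappedFile(const char* path)
    {
        int fd = ::open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st;
        if (::fstat(fd, &st) == 0 && st.st_size > 0) {
            void* p = ::mmap(nullptr, st.st_size, PROT_READ,
                             MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                _data = static_cast<uint8_t*>(p);
                _size = st.st_size;
            }
        }
        ::close(fd);   // the mapping stays valid after close
    }
    ~MappedFile() { if (_data) ::munmap(_data, _size); }

    // "Reads" are just memcpy from the mapping.
    size_t read(void* dst, size_t n)
    {
        if (!_data) return 0;
        size_t avail = _size - _pos;
        if (n > avail) n = avail;
        std::memcpy(dst, _data + _pos, n);
        _pos += n;
        return n;
    }

private:
    uint8_t* _data = nullptr;
    size_t _size = 0;
    size_t _pos = 0;
};
```

This is also why gprof would miss the cost, as you say: the work moves
into page faults serviced by the kernel rather than user-space calls,
so wall-clock measurements like FPS are the more telling numbers.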

> If anyone is ready to profile patches I'm willing to give some
> for evaluation.
> --strk; 
