Re: [fluid-dev] Multi core support not so great

From: josh
Subject: Re: [fluid-dev] Multi core support not so great
Date: Mon, 28 Sep 2009 13:46:02 -0700
User-agent: Internet Messaging Program (IMP) H3 (4.1.6)

Quoting David Henningsson <address@hidden>:
address@hidden wrote:
I finished implementing a first pass at multi-core support.

Oh, now I really must order a multi-core computer ;-)

Yeah, I want one now too! :)

While it
was a fun task, it didn't really yield the kind of performance I was
hoping for.  For those interested here is a description of the current

Added a synth.cpu-cores setting.

Perhaps "synth.workerthreads" would be more clear.

I still think cpu-cores sounds fine. It's the naming convention that Renoise uses, for example.

Additional core threads are created in new_fluid_synth()
(synth.cpu-cores - 1).
Primary synthesis thread signals secondary core threads when there is work.
Primary and secondary synthesis threads process individual voices in
Primary thread mixes all voices to left/right, reverb and chorus buffers.

Having multiple cores really just gives you the ability to have more
voices in the case of live performance (before maxing the CPU) or
*should* make your -F (fast MIDI render to file) operations go faster.
 The reason I say *should* is because it really depends on how complex
the MIDI file is.  If there aren't a lot of voices, it may in fact be
slightly worse performance.  Best case I have seen so far was about a
20% increase in speed (for the -F render case), which is something.
Interestingly the 2 cores were still not quite maxed.

Two things come to mind:

1) does the audio buffer size matter for the performance in this case?

Do you mean the FLUID_BUFSIZE (64) compile-time size? That matters for both the multi-core and the single-core case. I don't think it is necessarily more of a concern in the multi-core case.

2) if you run single-threaded, is one core maxed?

Well, close to it.  I should double check, good point.

One issue that I have stumbled upon, is in regards to thread
priorities.  We want the secondary core threads to be running at the
same priority as the primary synthesis thread, for round robin sort of
response (though it may not matter that much if they are on separate
CPUs).  In the case of -F fast rendering you definitely don't want your
processes running high priority (especially on Linux).  In the live case
though, the audio driver will be running high priority, so you want the
secondary core threads also running high priority.  The issue is, that
currently the secondary core threads are created in new_fluid_synth(),
while the synthesis "thread" is created by audio drivers or via other
means.  There needs to be some way to ensure that the secondary threads
end up having the same priority.  Any ideas?  Perhaps a one time
creation of the secondary threads within the fluid_synth_one_block
routine and an attempt to make them identical in priority, would make

I can imagine starting threads must have a higher upper bound in time
than e.g. malloc, so starting them from fluid_synth_one_block will
probably lead to an underrun (at one time only, but still). I would
prefer to have the audio drivers create the additional threads. After
all, they are the ones who know how to create threads with the right
priority, right?

You're right about the issues of starting up threads (even if it is only one time). Your suggestion sounds like the right thing to do. It's not yet apparent to me, though, how to correlate the cpu-cores setting, the resulting memory allocations related to that setting and the secondary core thread creation. What happens if FluidSynth is embedded in another application, for example? If that application is handling the audio itself, no worker threads would be created. Perhaps that's OK and it should be left up to the application to create the additional core threads, which would require additional API.
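
To make the priority part concrete, here is a minimal sketch with POSIX threads, assuming the helper is called from the thread that runs synthesis (for example from inside an audio driver). spawn_thread_like_caller() and report_policy() are hypothetical names for illustration, not FluidSynth API, and this only covers POSIX systems:

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper (not FluidSynth API): start a thread that uses the
 * same scheduling policy and priority as the calling thread. */
static int spawn_thread_like_caller(pthread_t *thread,
                                    void *(*func)(void *), void *arg)
{
    struct sched_param param;
    pthread_attr_t attr;
    int policy, err;

    assert(thread != NULL && func != NULL);

    /* Query the scheduling of the caller (the primary synthesis thread). */
    err = pthread_getschedparam(pthread_self(), &policy, &param);
    if (err != 0)
        return err;

    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    /* The policy and priority stored in attr are only honoured when
     * explicit scheduling is requested. */
    err = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    if (err == 0)
        err = pthread_attr_setschedpolicy(&attr, policy);
    if (err == 0)
        err = pthread_attr_setschedparam(&attr, &param);
    if (err == 0)
        err = pthread_create(thread, &attr, func, arg);

    pthread_attr_destroy(&attr);
    return err;
}

/* Example thread body: reports its own scheduling policy to the caller. */
static void *report_policy(void *arg)
{
    struct sched_param param;
    int *policy_out = arg;

    if (pthread_getschedparam(pthread_self(), policy_out, &param) != 0)
        *policy_out = -1;
    return NULL;
}
```

Copying the values explicitly keeps the intent visible and does not rely on the platform's default inherit-scheduler setting. It does nothing for the embedded case, where the application would still have to call something like this itself.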

In summary:
I realized through all this, that optimization is probably more
important than multi-core support.  Enabling multi-core support
introduces additional overhead, so unless you are trying to get more
voices in the realtime case or render MIDI files slightly faster, you're
better off not enabling it.

So now that I've learned my lesson, should I commit the code? ;)  Does it
seem worth it?  At the moment there may be some very minimal additional
overhead in the single core case (compared to before), but that is
probably so minimal as to be lost in the noise.

If all overhead is passing through some "if (cpucores > 1)" lines, I
would say it's nothing to worry about at all.

I think you should commit it, but call it an experimental feature at
this point. It could be a good ground for better multi-threading in the

I was previously using a mutex to lock the core parameters. This required a pair of lock/unlock instructions per voice, which I think was a significant amount of CPU consumption. I'm switching to a mostly lockless model, where the lock is only held while worker threads are checking for and then waiting for work. That means only one lock/unlock pair per fluid_synth_one_block() call per thread. That should help minimize the overhead and hopefully give better results.
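
Roughly what I have in mind, as a sketch with POSIX threads and C11 atomics rather than the actual FluidSynth code. All names here (core_pool_t, process_block() and so on) are made up for illustration, and the "rendering" is just a counter increment standing in for the voice DSP:

```c
#include <pthread.h>
#include <stdatomic.h>
#include <string.h>

#define MAX_VOICES 256

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t work_ready;  /* primary -> workers: a new block started */
    pthread_cond_t work_done;   /* workers -> primary: last worker finished */
    unsigned block_id;          /* bumped once per block (guarded by lock) */
    int workers_busy;           /* workers not yet done (guarded by lock) */
    int quit;                   /* guarded by lock */
    int voice_count;            /* written under lock before each block */
    atomic_int next_voice;      /* next unclaimed voice, claimed lock-free */
    int rendered[MAX_VOICES];   /* stand-in for per-voice output buffers */
} core_pool_t;

/* Claim voices one at a time with an atomic counter; no mutex per voice. */
static void render_voices(core_pool_t *p)
{
    int i;
    while ((i = atomic_fetch_add(&p->next_voice, 1)) < p->voice_count)
        p->rendered[i]++;       /* placeholder for real voice DSP */
}

static void *worker_main(void *arg)
{
    core_pool_t *p = arg;
    unsigned seen = 0;

    pthread_mutex_lock(&p->lock);
    for (;;) {
        while (!p->quit && p->block_id == seen)
            pthread_cond_wait(&p->work_ready, &p->lock);
        if (p->quit)
            break;
        seen = p->block_id;
        pthread_mutex_unlock(&p->lock);   /* render without the lock */
        render_voices(p);
        pthread_mutex_lock(&p->lock);
        if (--p->workers_busy == 0)
            pthread_cond_signal(&p->work_done);
    }
    pthread_mutex_unlock(&p->lock);
    return NULL;
}

/* Called by the primary thread once per block. */
static void process_block(core_pool_t *p, int voice_count, int num_workers)
{
    if (voice_count > MAX_VOICES)
        voice_count = MAX_VOICES;

    pthread_mutex_lock(&p->lock);
    p->voice_count = voice_count;
    atomic_store(&p->next_voice, 0);
    p->workers_busy = num_workers;
    p->block_id++;
    pthread_cond_broadcast(&p->work_ready);
    pthread_mutex_unlock(&p->lock);

    render_voices(p);                     /* primary renders voices too */

    pthread_mutex_lock(&p->lock);
    while (p->workers_busy > 0)
        pthread_cond_wait(&p->work_done, &p->lock);
    pthread_mutex_unlock(&p->lock);
}

static void pool_stop(core_pool_t *p, pthread_t *threads, int num_workers)
{
    int i;
    pthread_mutex_lock(&p->lock);
    p->quit = 1;
    pthread_cond_broadcast(&p->work_ready);
    pthread_mutex_unlock(&p->lock);
    for (i = 0; i < num_workers; i++)
        pthread_join(threads[i], NULL);
    pthread_cond_destroy(&p->work_done);
    pthread_cond_destroy(&p->work_ready);
    pthread_mutex_destroy(&p->lock);
}

static int pool_start(core_pool_t *p, pthread_t *threads, int num_workers)
{
    int i, err;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->work_ready, NULL);
    pthread_cond_init(&p->work_done, NULL);
    p->block_id = 0;
    p->workers_busy = p->quit = p->voice_count = 0;
    atomic_init(&p->next_voice, 0);
    memset(p->rendered, 0, sizeof p->rendered);
    for (i = 0; i < num_workers; i++) {
        err = pthread_create(&threads[i], NULL, worker_main, p);
        if (err != 0) {
            pool_stop(p, threads, i);     /* join the ones already started */
            return err;
        }
    }
    return 0;
}
```

Each worker releases and re-acquires the mutex once per block; the per-voice hand-off is a single atomic increment. The primary thread still blocks on a condition variable at the end of each block, which a real implementation may want to handle differently in the realtime case.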

I forgot that SCHED_FIFO processes continually run until they sleep, yield or a higher priority process becomes runnable. This oversight locked up my machine, since the primary thread wasn't able to process the audio buffers of the continually running secondary thread. Probably shouldn't be running SCHED_FIFO to begin with, when debugging stuff ;)

Which reminds me of another feature that should be implemented. I think we should add FluidSynth settings to enable/disable RT capabilities and set priority levels. I'm thinking something like this:
audio.realtime = yes/no
audio.realtime-prio = 1-99
midi.realtime = yes/no
midi.realtime-prio = 1-99

What do you think?  Might not apply on certain architectures.
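
As a sketch of how such settings might map onto POSIX scheduling (the settings are only a proposal at this point, and rt_settings_t and rt_settings_to_sched() are hypothetical names, not existing FluidSynth API):

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical mirror of the proposed "audio.realtime" and
 * "audio.realtime-prio" settings. */
typedef struct {
    int realtime;   /* yes/no */
    int prio;       /* requested priority, nominally 1-99 */
} rt_settings_t;

/* Translate the settings into a scheduling policy and priority.
 * Returns 0 on success, -1 if the priority range cannot be queried. */
static int rt_settings_to_sched(const rt_settings_t *s, int *policy,
                                struct sched_param *param)
{
    int lo, hi;

    assert(s != NULL && policy != NULL && param != NULL);

    /* Non-realtime: default time-sharing policy at its lowest priority. */
    *policy = s->realtime ? SCHED_FIFO : SCHED_OTHER;

    lo = sched_get_priority_min(*policy);
    hi = sched_get_priority_max(*policy);
    if (lo == -1 || hi == -1)
        return -1;

    if (!s->realtime) {
        param->sched_priority = lo;
        return 0;
    }

    /* Clamp the requested value to what this system actually allows,
     * since the valid range differs between platforms. */
    param->sched_priority = s->prio < lo ? lo : (s->prio > hi ? hi : s->prio);
    return 0;
}
```

The result could then be handed to pthread_setschedparam() by whichever thread the setting applies to. Switching to SCHED_FIFO usually needs extra privileges, so a failure there would have to fall back to normal scheduling.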

I'm thinking about pre-rendering a few buffers for every voice,
with some rollback in case the voice should change (i.e. note-off events,
modulators etc). That would bring more stability, but it is a very
long-term goal and is nothing I plan to implement in the near future.

Btw, if you're up to implementing features, I would suggest giving the
libaudiofile / libsndfile a go. "How do I make a wave file" is an issue
coming up every now and then.

I was planning on working on that next, especially since you forwarded me that wav patch recently. Using libsndfile or libaudiofile makes the most sense, for the added flexibility. It could render to floating point too, for that matter. I would probably just add this to the -F fast render feature. I suppose the aufile driver might also benefit, though it seems less useful now.

// David

