Re: [gnuspeech-contact] a Python wrapper for the TRM

From:	Leif Johnson
Subject:	Re: [gnuspeech-contact] a Python wrapper for the TRM
Date:	Mon, 7 Mar 2011 12:41:50 -0600

Hi David -

Thanks for the feedback !

I am indeed aware that there's a lot more to speech than just the
articulatory synthesis. However, I'm working on some acoustic
modeling, and I've been casting about for an open-source articulatory
synthesizer library. The projects that I've found in this area have
been (a) praat, which has an interesting synthesis model but a poorly
documented and UI-locked "scripting" interface, and (b) festival,
which seems focused on providing a text interface to the speech
synthesis engine, whereas I'm looking for a lower-level programmable
interface. I was excited to run across gnuspeech and its compact TRM
code, hence the Python wrapper.

I'm a bit over-booked right now to commit to working on gnuspeech in
its entirety. :( But if I continue down this path of learning about
speech synthesis then I might encounter the project again soon at a
higher level. Until then, the TRM seems just what I need.

Please let me know what you think about the license issue; I'm rather
curious how numpy and GNU will interact.

lmj

On Thu, Feb 24, 2011 at 10:21, David Hill <address@hidden> wrote:
> Hi Leif,
>
> I will respond to the license question shortly but, in the meantime, I wonder 
> what you have in mind for a bare TRM package that simply converts incoming 
> parameter frames into audio samples. I am sure you are aware that there is 
> much more to speech in any language than the ability to "play" the TRM.
>
> There are matters of the dynamic state changes that represent articulation, 
> and questions of rhythm and intonation, not to mention the business of how 
> words are pronounced (text-to-phonetic-representation conversion), all of 
> which are embodied in the Monet and TTS client Apps.
>
> If you are interested in working on the project I hope you have read up on 
> the background documentation which explains much of this, and I would welcome 
> your contribution. We really need some good people to help finish the port 
> from the original NeXT version (which was complete) and then work on improving 
> it. A major need is for completion of the editing facilities in Monet, which 
> exist only as stubs at present.
>
> Monet is really key to creating the text-to-speech rules for different 
> languages, and it is the TTS client that is intended as the "speaking" 
> service for other Apps that need text-to-speech. Obviously, improving the 
> quality of the rules for speaking English also requires work with Monet (when 
> it is complete).
>
> The tube model by itself may be useful in its own right for psychophysical 
> experiments. The "Synthesizer" App -- which basically provides an interactive 
> interface to the TRM -- is an important tool for use with Monet in developing 
> the posture specifications for an arbitrary language, and also requires some 
> further work and testing, though the basic system functions well.
>
> I shall be interested in your reply.
>
> HTH.
>
> All good wishes.
>
> david
>
> On Feb 23, 2011, at 7:45 AM, Leif Johnson wrote:
>
>> Hi everyone -
>>
>> Many thanks for your work on gnuspeech. Last week I was able to put
>> together---in a day !---a Python wrapper around the TRM, using SWIG
>> and the code from
>> svn://svn.sv.gnu.org/gnuspeech/osx/trunk/Frameworks/Tube (it seemed to
>> be the most recent version).
>>
>> I have hacked up a Python wrapper that allows one to create a
>> TubeModel and then pass in configuration parameters as a numpy array
>> (or a list of lists, anything that can be treated as a sequence of
>> 16-float frames), getting back a numpy array of 1-channel audio
>> samples. There's also a quick translation of the main() routine from
>> softwareTRM that allows a similar conversion from a configuration file
>> to an array of audio samples.
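For readers skimming the thread, here is a minimal sketch of the calling pattern described above: frames of 16 numbers go in, one channel of audio samples comes out. The class name `FakeTubeModel`, its `synthesize` method, the `as_frames` helper, the constructor arguments, and the sine-wave output are all hypothetical stand-ins. They are not the wrapper's actual API and not the TRM algorithm; only the 16-value frame size and the single-channel output come from the message. Plain Python lists are used so the sketch has no dependencies; a 2-D numpy array iterates row by row and should pass through `as_frames` the same way.

```python
import math

FRAME_SIZE = 16  # values per parameter frame, as stated in the message


def as_frames(frames):
    """Coerce any sequence of 16-number frames into a list of float tuples."""
    out = []
    for i, frame in enumerate(frames):
        values = tuple(float(v) for v in frame)
        if len(values) != FRAME_SIZE:
            raise ValueError(
                "frame %d has %d values, expected %d"
                % (i, len(values), FRAME_SIZE))
        out.append(values)
    return out


class FakeTubeModel:
    """Hypothetical stand-in for the wrapped TRM; NOT the real synthesizer."""

    def __init__(self, sample_rate=22050, samples_per_frame=100):
        # Both arguments are assumptions made for this illustration only.
        self.sample_rate = sample_rate
        self.samples_per_frame = samples_per_frame

    def synthesize(self, frames):
        """Return one channel of samples as a flat list of floats.

        Placeholder behaviour: the first value of each frame is treated as
        a frequency in Hz and rendered as a sine wave. The real TRM does
        something entirely different with its 16 parameters.
        """
        samples = []
        phase = 0.0
        for frame in as_frames(frames):
            step = 2.0 * math.pi * frame[0] / self.sample_rate
            for _ in range(self.samples_per_frame):
                samples.append(math.sin(phase))
                phase += step
        return samples


if __name__ == "__main__":
    model = FakeTubeModel(sample_rate=8000, samples_per_frame=10)
    frames = [[440.0] + [0.0] * 15 for _ in range(3)]
    audio = model.synthesize(frames)
    print(len(audio), "samples in one channel")
```

The validation step is the part most likely to carry over to a real wrapper: rejecting frames of the wrong length up front gives a clearer error than letting a malformed frame reach the C code.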
>>
>> I'd like to release this code as a Python package (so that it's easy
>> to install with pip), but I wanted to check with you all first to see
>> what you think. Are there any objections, or suggestions ? In
>> particular, I wasn't sure what the licensing rules would require --
>> GNU seems to suggest the GPL, but Python and numpy are more MIT-style.
>>
>> Please CC me with any responses. Thanks, and happy hacking !
>>
>> lmj
>>
>> --
>> http://leifjohnson.net
>>



-- 
http://leifjohnson.net
