Re: [Discuss-gnuradio] Resample block for audio signal

From: Marcus Müller
Subject: Re: [Discuss-gnuradio] Resample block for audio signal
Date: Mon, 14 Mar 2016 13:05:53 +0100

Hi Murray,

sounds like a nice thing to think about in a coffee break:

pitch shifting "real" instruments and vocals so that the result sounds good is a fairly complex problem [1], but yes, a special kind of resampling is probably what you're after.

> I can achieve an octave of the signal multiplying it by itself (doubling the frequencies).

Is that true? I can see that for a single tone, because for an input signal $s_1 = \cos(f_1 t)$ you'd get

${s_1}^2 = \cos^2(f_1 t) = \frac{1}{2}\left(\cos (2 f_1 t) + 1\right)$,

but for a musical instrument, you'd typically have at least two tones making up the timbre:

$s_2 = \cos(f_1 t) + \cos(f_2 t)$, so
${s_2}^2 = \cos^2(f_1 t) + 2\cos(f_1 t)\cos(f_2 t) + \cos^2(f_2 t)$, of which we know the form of the first and last terms, yielding
${s_2}^2 = \frac{1}{2}\left(\cos (2 f_1 t) + 1\right) + s_{mix} + \frac{1}{2}\left(\cos (2 f_2 t) + 1\right)$, with $s_{mix} = 2\cos(f_1 t)\cos(f_2 t)$.

Now, $\cos(x)\cos(y) = \frac{1}{2}\left(\cos(x+y)+\cos(x-y)\right)$, which means that $s_{mix}=\cos\left((f_1 + f_2)t\right) + \cos\left((f_1 - f_2)t\right)$ contains both intermodulation products of $f_1$ and $f_2$, i.e.
${s_2}^2 = \frac{1}{2}\left(\cos (2 f_1 t) + 1\right) + \cos\left((f_1 + f_2)t\right) + \cos\left((f_1 - f_2)t\right) + \frac{1}{2}\left(\cos (2 f_2 t) + 1\right) = \frac12\left(\cos(2f_1 t) + \cos(2f_2 t)\right) + \cos\left((f_1 + f_2)t\right) + \cos\left((f_1 - f_2)t\right) + 1$,
which means each "unwanted" intermodulation product has twice the amplitude, and hence four times the power, of the components at twice the input frequencies. That is especially problematic since $f_1+f_2$ always lies between $2f_1$ and $2f_2$ (it is their mean) and hence, unlike the DC component $+1$, can't be filtered out.
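
A quick numerical check of the two-tone result, as a minimal NumPy sketch. The sample rate, the frame length and the two frequencies (100 Hz and 150 Hz) are arbitrary assumptions chosen so that every component falls exactly on an FFT bin; the code works with frequencies in Hz, so $\cos(f t)$ from the derivation becomes `cos(2*pi*f*t)`.

```python
import numpy as np

fs = 1000          # sample rate in Hz (assumed for illustration)
n = 1000           # one second of samples, so FFT bin k corresponds to k Hz
f1, f2 = 100, 150  # the two partials in Hz (assumed for illustration)

t = np.arange(n) / fs
s2 = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)


def amplitude_spectrum(x):
    """Single-sided amplitude spectrum of a real signal."""
    spec = np.abs(np.fft.rfft(x)) / len(x)
    spec[1:] *= 2          # fold negative frequencies onto positive ones
    if len(x) % 2 == 0:
        spec[-1] /= 2      # the Nyquist bin has no negative-frequency twin
    return spec


# Squaring the signal is the "multiply it by itself" operation.
amps = amplitude_spectrum(s2 ** 2)

for f in (0, f2 - f1, 2 * f1, f1 + f2, 2 * f2):
    print(f"{f:4d} Hz: amplitude {amps[f]:.3f}")
```

With these values the squared signal should show amplitude 1 at DC, 50 Hz and 250 Hz, and amplitude 0.5 at 200 Hz and 300 Hz, which matches the factor of two in amplitude (four in power) derived above.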

So, yes, intuitively, going through frequency domain does sound good:

$s_{in} \rightarrow \textrm{FFT} \rightarrow \text{Magic}_\text{good, old} \rightarrow \textrm{IFFT} \rightarrow s_{out}$

However, we know that the Magic used here cannot be linear and time-invariant: multiplication with something in frequency domain is equal to convolution with its time domain equivalent, convolution is linear and time-invariant, and such operations don't shift frequencies individually by different amounts (e.g. $f_1$ needs to be shifted by $f_1$, unlike $f_2$, which needs to be shifted by $f_2$).

One way would be to interpolate in frequency domain (notice that the interleaver at the top right is de facto a simple interpolator):

The resulting signals would behave like this (top: time domain, bottom: frequency domain of in- and output):

Notice a few things:
The FFT/IFFT + interleaver is not doing any filtering, and that's what you see as jagged discontinuity in the signal spectrum. I don't think this is going to sound overly nice with "real" sounds, but it might "do the job"; you know these "springy" sounds you get when your GSM connection is really bad. That.
On the other hand, doing filtering in frequency domain (as in "using a proper interpolator rather than just inserting zeros") will have side effects on the time domain signal, and render it pretty unusable (multiplication with the repeated inverse Fourier transform of the filter response...).
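
The zero-interleaving idea can be sketched per frame in NumPy. This is only one possible reading of the FFT / interleaver / IFFT chain, not a transcription of the flow graph: the function name `double_pitch_frame` is hypothetical, the frame length stays unchanged, and bins that would land above Nyquist after doubling are simply dropped (an assumption).

```python
import numpy as np


def double_pitch_frame(frame):
    """Move FFT bin k of a real frame to bin 2k, leaving zeros in between.

    Hypothetical helper for illustration: no windowing, no overlap and no
    filtering, so frame boundaries and the spectrum will be rough.
    """
    n = len(frame)
    spectrum = np.fft.rfft(frame)        # bins 0 .. n//2
    shifted = np.zeros_like(spectrum)    # odd bins stay zero ("interleaved" zeros)
    even_bins = shifted[::2]             # view onto bins 0, 2, 4, ...
    even_bins[:] = spectrum[:len(even_bins)]  # bin k -> bin 2k, rest is dropped
    return np.fft.irfft(shifted, n)


# Usage: a tone on bin 5 of a 64-sample frame ends up on bin 10.
n = 64
frame = np.cos(2 * np.pi * 5 * np.arange(n) / n)
out = double_pitch_frame(frame)
print(np.argmax(np.abs(np.fft.rfft(out))))
```

Because only every second bin of the output spectrum is occupied, the output frame is periodic with half the frame length: in effect the (band-limited) frame content is played twice at double speed, which is one way to see where the "springy" artefacts would come from.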

I'm really no expert in audio processing, but I guess there's more to this problem than basic operations.

Best regards,

[1] I wrote something completely "out of the blue". It has to do with goats. Which is not my fault. But I found it funny enough to write an answer.

On 13.03.2016 13:29, Murray Thomson wrote:

This is probably an easy one but I'm stuck and I could do with some help. My goal is to get a musical note from the microphone and shift its frequency to transform the note to a different scale. For this to happen, I need to multiply all the frequencies by, e.g., 1.5.

I can achieve an octave of the signal multiplying it by itself (doubling the frequencies). I thought I could do this by resampling the signal but now I'm not too sure. Do I need to use an FFT block for this?

I would appreciate it if someone could suggest the best way to go or point me in the right direction.



Attachment: re_pitching.grc
Description: application/gnuradio-grc
