
From: n_nelson
Subject: Re: [Octal-dev] Fourier Transforms and samples
Date: Mon, 22 May 2000 11:38:52 -0700

  Steve Mosher wrote:

  ----

  There ought to be sound cards that do this...  to boot, clusters are
  impractical for home use, not so bad for studio use (good idea,
  actually), and horrible for set use.  It would be -sweet- to see a
  cluster devoted to processing FFTs -- it could be optimized to do
  simply that, and a higher load could be placed on it.  It would be
  more useful to have it perform all your DSP...  then again, wouldn't
  it be more efficient (and cheaper) to rip apart a PSX2 and use the
  sound synthesizer?  I still want to see Octal ported to it =).  I
  think it would be the best platform thus far for such a program...

  ----

  In trying to get up to speed on the Octal DSP situation, I reviewed
  many of the links at www.dspdimension.com/html/links.html.  In my
  opinion, the approaches there divide into two basic themes: (1)
  digital processing of sound that emulates analog methods, and (2)
  digital sound synthesis using Fourier-description-like methods.
  The division has historical roots: on one hand, analog-era
  limitations kept music analysis short of the desired Fourier
  description (or steered it toward interesting analog-derived
  effects); on the other, the generating methods used in, say,
  musical scores are essentially Fourier based.  A sheet of music
  identifies the fundamental frequencies, their start and stop times,
  amplitudes, and harmonic formants (defined by the instrument
  selected to play the passage and the designated method of playing).
  So we have DSP effects applied to sampled sound, with some use of
  Fourier-description approximation, and digital synthesizer methods
  using various Fourier description formats (of which I expect Octal
  is one--what is the web site for Octal?).  From my perspective,
  these are approximations (some perhaps very good to precise) of (1)
  the forward Fourier transform (analysis) and (2) the inverse
  transform (synthesis).

  Some approaches may be neither interested in nor related to Fourier
  descriptions, but the primary emphasis should be on Fourier
  descriptions, because our awareness of sound is essentially
  Fourier: the ear performs that transform and provides a Fourier
  description to the brain.  What we hear, in our brain or awareness,
  is a Fourier description, and it is that medium we should
  concentrate on.

  On the DSP analysis side, we never seem to reach high-quality
  control over the Fourier description, whether because of the
  significant computational requirement, because the result is only
  approximate (distortion), or because there is no simplification or
  further translation toward music-score or common synthesis formats.
  On the synthesis side, we have somewhat blocky, unnatural-sounding
  tools: the result may sound electronic, or the tool cannot accept
  the analysis description at all because of the significant gap
  between analysis descriptions and the common synthesis
  formats/descriptions our tools are designed around.  That is, there
  is an undesirable gap between the analysis and synthesis
  descriptions.  And as that gap is closed, additional tools will be
  required to manipulate the eventual, optimal description.

  My interest concentrates on the recording studio, where the
  real-time constraint can be relaxed, whereas Steve's interest
  appears to be real-time usage.  For real-time use of pre-analyzed
  descriptions, note that our music scores, and our common way of
  speaking about music, are a simplification based on fundamentals
  (the lowest frequency) plus other indexes and abbreviations.  We
  would perform the analysis ahead of time to obtain that
  simplification, so that a real-time analysis of, say, an instrument
  could proceed quickly to the indexes--e.g., use routines that
  extract the fundamental frequency while ignoring the other
  frequencies--and then use those indexes to generate a complete
  sound from the pre-analyzed description.  Once the fundamental
  frequency is obtained, the harmonic formant, decay envelopes, and
  the other description components would be generated from it, as
  sketched below.
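
  To make the fast path concrete, here is a minimal sketch of the
  real-time analysis step.  It assumes autocorrelation pitch
  estimation, which is only one common choice (no particular method
  is specified above), and the function name is hypothetical:

    import numpy as np

    def estimate_fundamental(frame, rate, fmin=50.0, fmax=2000.0):
        # Estimate the fundamental of one audio frame by picking the
        # strongest autocorrelation lag in the plausible pitch range.
        frame = frame - frame.mean()
        ac = np.correlate(frame, frame, mode='full')[len(frame) - 1:]
        lo = int(rate / fmax)                 # shortest lag considered
        hi = min(int(rate / fmin), len(ac) - 1)
        lag = lo + int(np.argmax(ac[lo:hi]))
        return rate / lag                     # fundamental in Hz

  The returned fundamental would then index the pre-analyzed
  description (harmonic formant, decay envelopes, and so on) for the
  instrument at hand.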

  The recording-studio and real-time approaches are complementary in
  that the real-time approach is essentially the recording-studio
  synthesis portion with a minimal analysis portion.  For real-time
  use, we could do the analysis portion in the studio to obtain the
  pre-analyzed descriptions; in the studio, we will want to use many
  of the real-time techniques in the synthesis portion.

  The primary objective is to obtain, in the analysis phase, a
  high-quality but maximally simple Fourier description that bridges
  easily to the common techniques used in the synthesis phase.  I
  will detail my general program in the following, and if it is of
  any useful merit, I hope proper credit will be given.

  Stephan Sprenger, at www.dspdimension.com/html/pscalestft.html,
  detailed the STFT procedure for obtaining the precise frequencies
  that become a key requirement of the hoped-for Fourier description.
  What does not seem to be commonly realized--likely because of the
  thorough, and on historically low-powered computers almost
  necessarily practical, use of the FFT--is that once the primary
  frequency nodes have been estimated, as Stephan Sprenger
  illustrates, variations on the Complete Fourier Transform (CFT in
  the following; essentially direct evaluation of the Fourier sum at
  chosen frequencies), which is commonly considered too slow, can be
  applied, with a computational requirement proportional to the
  number of frequency nodes (each one pass over the samples) rather
  than quadratic in the sample size, as the full direct transform
  would be.  And I suspect it will be more computationally efficient
  to refine the frequency precision with the CFT after an initial
  STFT estimate than to use increasingly overlapping STFTs.  I.e., if
  you know the local range in which a frequency is expected, just run
  a convergent sequence of individual frequency evaluations using the
  CFT.
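
  Evaluating the Fourier sum at a single arbitrary frequency costs
  one O(n) pass over the samples (the Goertzel algorithm is the
  textbook form of this for a single bin).  Here is a minimal sketch
  of the refinement idea, assuming a golden-section search as the
  convergent method--my choice for illustration; the function names
  are hypothetical:

    import numpy as np

    def lcft_magnitude(x, rate, freq):
        # One O(n) projection of the samples onto a complex
        # exponential at an arbitrary (fractional) frequency in Hz.
        n = np.arange(len(x))
        return abs(np.dot(x, np.exp(-2j * np.pi * freq * n / rate)))

    def refine_frequency(x, rate, f_lo, f_hi, steps=40):
        # Golden-section search for the magnitude peak inside the
        # local range [f_lo, f_hi] given by an initial FFT/STFT bin.
        g = (np.sqrt(5.0) - 1.0) / 2.0
        a, b = f_lo, f_hi
        for _ in range(steps):
            c, d = b - g * (b - a), a + g * (b - a)
            if lcft_magnitude(x, rate, c) < lcft_magnitude(x, rate, d):
                a = c
            else:
                b = d
        return (a + b) / 2.0

  The same convergent-search pattern could in principle be turned on
  the other quantities mentioned below (start time, envelope), with a
  suitable objective in place of the magnitude.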

  Even though the FFT (the core algorithm of the STFT) has a lower
  computational load, n log n for n samples, the FFT computes _every_
  discrete frequency determined by the number of samples.  A CFT, by
  contrast, can be limited (LCFT in the following) to look at only
  _one_ frequency at a time, so that the total load scales with the
  number of frequency nodes of interest--each a single pass over the
  samples--and _not_ with the full set of bins the sample count would
  dictate.
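
  Rough numbers make the trade-off concrete.  With n = 4096 samples,
  an FFT costs on the order of n log2 n, i.e. about 4096 x 12 =
  49,000 operations, and returns all 4096 bins whether or not we want
  them; one LCFT evaluation costs on the order of n = 4096
  multiply-adds, so following, say, ten frequencies costs roughly
  41,000 operations--comparable to one FFT, but spent only on the
  frequencies of interest, and at arbitrary fractional positions.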

  The LCFT obtains a per-frequency advantage in that it can be tuned
  to an arbitrary fractional frequency.  And convergent methods can
  likewise be applied to other quantities, such as the exact
  frequency start time, the amplitude envelope, and the frequency
  movement.

  The output of the FFT is a sequence of Fourier description blocks
  (possibly simultaneous sequences of different-length blocks), so
  the description is dense and finely time-divided.  This FFT format
  must be maintained if an inverse FFT is used for synthesis.
  However, I expect that most common synthesizers generate individual
  frequencies and add them together--an inverse LCFT, in effect--
  rather than attempting an inverse FFT, primarily because, again,
  the load is then a function of the number of frequencies rather
  than of the full bin count.
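
  A minimal sketch of that inverse-LCFT route, assuming a simplified
  description format of (frequency, amplitude, decay) triples
  invented here for illustration:

    import numpy as np

    def synthesize(partials, rate, duration):
        # Additive synthesis: generate each frequency individually
        # with an exponential decay envelope and sum the results.
        t = np.arange(int(rate * duration)) / rate
        out = np.zeros_like(t)
        for freq, amp, decay in partials:
            out += amp * np.exp(-t / decay) * np.sin(2 * np.pi * freq * t)
        return out

    # E.g., a plucked-string-like tone: a 220 Hz fundamental plus two
    # harmonics that decay progressively faster.
    tone = synthesize([(220.0, 1.0, 0.8), (440.0, 0.5, 0.5),
                       (660.0, 0.3, 0.3)], rate=44100, duration=2.0)

  The cost is (number of partials) x (number of samples), which is
  the scaling claimed above.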

  The output of the LCFT will also be denser and more time-divided
  than we would like, which calls for a further simplification
  procedure.  E.g., one LCFT frequency description over a short span
  of time may be combined with subsequent descriptions into a longer
  envelope that becomes a single frequency description.  Harmonics
  that begin and decay together are identified, and these may be
  assembled into sound objects, or notes from a particular
  instrument.  The coding for this simplification procedure is a kind
  of lossy compression, or a finding of a smaller best fit by AI
  (pattern recognition).  This compression then provides the
  simplified Fourier description to be used in synthesis; a toy
  version of the merging step follows.
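
  A toy sketch of the frame-merging step.  It greedily links
  per-frame (frequency, amplitude) peaks into longer tracks; serious
  systems do this more carefully (McAulay-Quatieri partial tracking
  is the classic treatment), and the names and tolerance here are
  illustrative only:

    def merge_frames(frames, freq_tol=2.0):
        # frames: list of per-frame peak lists, each peak (freq, amp).
        # A peak continues the track that ended in the previous frame
        # within freq_tol Hz; otherwise it starts a new track.
        tracks, active = [], []
        for i, peaks in enumerate(frames):
            nxt = []
            for freq, amp in peaks:
                match = next((t for t in active
                              if t[-1][0] == i - 1
                              and abs(t[-1][1] - freq) <= freq_tol),
                             None)
                if match is None:
                    match = []
                    tracks.append(match)
                match.append((i, freq, amp))   # (frame, freq, amp)
                nxt.append(match)
            active = nxt
        return tracks

  Each resulting track is one long-envelope frequency description;
  tracks at near-integer frequency ratios that start and decay
  together would then be grouped into notes.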

  In recap: use the FFT to obtain the approximate frequency nodes,
  use the LCFT to obtain the precise descriptions, and use AI
  compression to obtain the simplified descriptions.  As an
  additional note, the entire analysis-synthesis sequence can be
  bounded within an (AI) optimization loop in which the synthesis
  output (the sample stream) is subtracted from the input, and
  convergence factors are selected for minimum distortion (smallest
  residual signal), smallest simplified description, and smallest
  computational load.  Different kinds of music or sounds, as in the
  real-time scenario, can also be identified in the analysis phase to
  tune and minimize that computational load.
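
  The distortion figure for that loop is straightforward to state in
  code; the convergence test below is one simple choice among many
  (the exact criteria are left open above):

    import numpy as np

    def residual_energy(original, resynth):
        # Subtract the synthesis output from the input and measure
        # the energy that remains--the 'minimum distortion' objective.
        n = min(len(original), len(resynth))
        d = original[:n] - resynth[:n]
        return float(np.dot(d, d) / n)

    def converged(history, rel_tol=1e-3):
        # Stop refining the description once the residual energy
        # stops improving by at least rel_tol per iteration.
        return (len(history) >= 2 and
                history[-2] - history[-1] < rel_tol * history[-2])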

  Neil Nelson address@hidden


