13 December 2006

Matlab exchange server examples

  1. sound_acquisition.m This script reads a number of seconds of audio from your computer's sound card. It is easy to change the duration, the sampling frequency and the sound card.
  2. pitch determination algorithm (.zip) This zip file contains a few files: the algorithm itself and some post-processing files.
    • The input Y specified here is not a wave file but the signal vector itself. I wrote the wavread component into the m-file.
    • Specifying only the input and the sampling frequency is enough; the other parameters have defaults.
    • The algorithm works but I don't really understand the outcome.
      % Input parameters (There are 9):
      %
      % Y: Input data
      % Fs: Sampling frequency (e.g., 16000 Hz)
      % F0MinMax: 2-d array specifies the F0 range. [minf0 maxf0], default: [50 550]
      %     Quick solutions:
      %     For male speech: [50 250]
      %     For female speech: [120 400]
      % frame_length: length of each frame in millisecond (default: 40 ms)
      % TimeStep: Interval for updating short-term analysis in millisecond (default: 10 ms)
      % SHR_Threshold: Subharmonic-to-harmonic ratio threshold in the range of [0,1] (default: 0.4).
      %     If the estimated SHR is greater than the threshold, the subharmonic is regarded as F0 candidate,
      %     Otherwise, the harmonic is favored.
      % Ceiling: Upper bound of the frequencies that are used for estimating pitch. (default: 1250 Hz)
      % med_smooth: the order of the median smoothing (default: 0 - no smoothing);
      % CHECK_VOICING: check voicing. Current voicing determination algorithm is kind of crude.
      %     0: no voicing checking (default)
      %     1: voicing checking
      %
      % Output parameters:
      %
      % f0_time: an array stores the times for the F0 points
      % f0_value: an array stores F0 values
      % SHR: an array stores subharmonic-to-harmonic ratio for each frame
      % f0_candidates: a matrix stores the f0 candidates for each frames, currently two f0 values generated for each frame.
      %     Each row (a frame) contains two values in increasing order, i.e., [low_f0 higher_f0].
      %     For SHR=0, the first f0 is 0. The purpose of this is that when you want to test different SHR
      %     thresholds, you don't need to re-run the whole algorithm. You can choose to select the lower or higher
      %     value based on the shr value of this frame.
    • I think the m-file contains some nice functions for our field (listed from the bottom of the file to the top): generate a window function; compute the zero-crossing rate; post voicing checking; split the signal into frames; determine the energy threshold of silence; determine whether a segment is voiced, unvoiced or silence; compute the subharmonic-to-harmonic ratio; do an FFT and get the log spectrum.
  3. AGC.m An automatic gain control script. It might be interesting for all the tests it does on the input signal and because it can process both mono and stereo inputs. If we ever need gain, power or energy commands, this is the first place to look.
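
The f0_candidates output described in item 2 is meant to let you try different SHR thresholds without re-running the whole algorithm. A minimal sketch of that idea follows. The frame values are made up for illustration, and it assumes that SHR is a column vector with one entry per frame, matching the rows of f0_candidates. Whether the m-file applies exactly this rule internally is an assumption based on the header comments, not something checked against the code.

```matlab
% Hypothetical example data for three frames (values are made up).
% Each row of f0_candidates is [low_f0 higher_f0]; for SHR = 0 the low value is 0.
f0_candidates = [  0 220;
                 110 220;
                 105 210];
SHR = [0; 0.55; 0.25];     % assumed: one SHR value per frame, as a column

threshold = 0.4;           % the documented default for SHR_Threshold

% Start from the higher candidate (the harmonic) for every frame.
f0_selected = f0_candidates(:, 2);

% Where the SHR exceeds the threshold, take the lower candidate
% (the subharmonic) instead, as the header comments describe.
use_low = SHR > threshold;
f0_selected(use_low) = f0_candidates(use_low, 1);

disp(f0_selected.');
```

With these made-up values only the second frame has an SHR above 0.4, so it is the only one that switches to the lower candidate. Changing threshold and re-running just these few lines gives a new F0 track from the same candidates.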
