13 December 2006

Matlab exchange server examples

  1. sound_acquisition.m It reads in a number of seconds of audio from your computers soundcard and it is easy to change the duration, samplefrequency and soundcard.
  2. pitch determination algorithm (.zip) This zip-file contains a few files: the algorithm itself and some post-processing things.
    • The input Y specified here is not a wave file but the vector. I wrote the wavread component inte the m-file.
    • If you specify the input and the samplefrequency, it's enough.
    • The algorithm works but I don't really understand the outcome.
      % Input parameters (There are 9):
      %
      % Y: Input data
      % Fs: Sampling frequency (e.g., 16000 Hz)
      % F0MinMax: 2-d array specifies the F0 range. [minf0 maxf0], default: [50 550]
      % Quick solutions:
      % For male speech: [50 250]
      % For female speech: [120 400]
      % frame_length: length of each frame in millisecond (default: 40 ms)
      % TimeStep: Interval for updating short-term analysis in millisecond (default: 10 ms)
      % SHR_Threshold: Subharmonic-to-harmonic ratio threshold in the range of [0,1] (default: 0.4).
      % If the estimated SHR is greater than the threshold, the subharmonic is regarded as F0 candidate,
      % Otherwise, the harmonic is favored.
      % Ceiling: Upper bound of the frequencies that are used for estimating pitch. (default: 1250 Hz)
      % med_smooth: the order of the median smoothing (default: 0 - no smoothing);
      % CHECK_VOICING: check voicing. Current voicing determination algorithm is kind of crude.
      % 0: no voicing checking (default)
      % 1: voicing checking
      % Output parameters:
      %
      % f0_time: an array stores the times for the F0 points
      % f0_value: an array stores F0 values
      % SHR: an array stores subharmonic-to-harmonic ratio for each frame
      % f0_candidates: a matrix stores the f0 candidates for each frames, currently two f0 values generated for each frame.
      % Each row (a frame) contains two values in increasing order, i.e., [low_f0 higher_f0].
      % For SHR=0, the first f0 is 0. The purpose of this is that when you want to test different SHR
      % thresholds, you don't need to re-run the whole algorithm. You can choose to select the lower or higher
      % value based on the shr value of this frame.
    • I think some nice functions in our field are found in the m-file (bottom to top): generate a window function, compute zero-crossing rate, post voicing checking, split signal into frames, determine the energy treshold of silence, determine whether the segment is voiced, unvoiced or silence, compute subharmonic-to-harmonic ratio, do FFT and get log spectrum
  3. AGC.m An automatic gain control script, might be interesting for all the tests it does on the input signal and because it can process both mono and stereo inputs. If we ever need gain, power or energy commands, here's the place to look first for.

09 December 2006

unicomb, vibrato, wisperization

1) unicomb testing for toms_diner.wav
  • calculations much too long... must be done something about it.
  • after 20 minutes he did it...
  • works, but veeerrry slowly
2) trying to make vibrato work
  • too long in time
  • bad parameterchoice
  • now OK
3) testing of wisperization
  • book is not here, so I'm just fooling around a bit
  • effect is clearly audible, but somewhat too harsh
  • experiment a little with the parameters and turn the m-file in one where you can give the wav-file to.

06 December 2006

matlabsession ronny and tony

Today we try to implement some effects. Some notes:
  • Instead of wavplay() you can also use sound(vector,sampleFrequency), it does fairly the same. Also stereo if the matrix is Nx2. soundsc() even plays the sound at the maximum amplitude without clipping.
    With wavrecord() one can record wave audio from the windows wave input device.
  • When using wavread and wavplay and others, pay attention to play at the same sample frequency as read. Otherwise the sound will mostly play as if it passed through an lowpass filter, because the standard Fs = 11025Hz
  • Ctrl+C is used to terminate a calculation.
  • switch/case statement can be used in matlab too
  • m-files can make use of cells to make the overview clearer
I tried implementing a flanger effect I found on the mathworks exchange server. Herefore I made a function that I can use throughout our quest for cool effects:
function[] = applyEffect(effectFile,soundFile)
Example: applyEffect('flange','x1.wav');
<< soundfile =" 'redwheel.wav';" effectfile =" '';" variation =" .01;" rate =" .3;" outputsignal =" flange(inputSignal," outputsignal =" signal;">>
This works well. I tried it with flange and after experimenting with the parameters variation and rate it worked nice enough.
Now I'm trying to understand every line of flange.m for future use... Some notes:
  • : Colon.
    J:K is the same as [J, J+1, ..., K]
  • CEIL(X) rounds the elements of X to the nearest integers
    towards infinity.
  • ROUND(X) rounds the elements of X to the nearest integers.
  • and thus: CEIL(X) <= ROUND(X) CEIL(X) >= ROUND(X) >= FLOOR(X)
  • >> abs([1 2 3;4 5 6;7 8 -9])
    ans = 1 2 3
    4 5 6
    7 8 9
  • .* Array multiply.
    X.*Y denotes element-by-element multiplication. X and Y
    must have the same dimensions unless one is a scalar.
I almost succeeded understanding the whole m-file about flange. Got stuck with this line:
signal = signal(cosineSamples+modulation-lfo);
signal is a column vector and between the brackets there is a row vector, what could this mean then?

I include the flange.m file in the version of today:
function [signal]=flange(signal, sampleFrequency, variation, rate)
%[y] = flange(fs, v, x, r)
%
% This is a basic flanging effect.
%
% fs = Sample rate
% v = Variation.
% x = Input audio signal. This should be a column
% vector.
% r = Rate.
%
% Example:
%
% >>y = flange(fs,0.002,x,0.5);
%
%
% See also WAVREAD and SOUND
%
%Version 1.0
%Coded by: Stephen G. McGovern, date: 08.03.03

modulation = ceil(variation*sampleFrequency);
%a row containing a raising value per sample of the signal
cosineSamples = 1:length(signal)+ modulation;
variation = round(variation*sampleFrequency);
padding = zeros(modulation,1);
%maxSample is the biggest amplitude of the vector or matrix signal
maxSample = max(abs(signal))
signal = [padding;signal;padding];
%period in seconds
period = 2*pi/round(sampleFrequency*rate);
%approximates a cosine which variates about variation/2 as an LFO
lfo = round((variation/2)*(1-cos(period.*cosineSamples)));
%chooses the samples indicated by the vector
cosineSamples+modulation-lfo, from signal
signal = signal(cosineSamples+modulation-lfo);
maxSample = maxSample/max(abs(signal));
signal = maxSample*signal;

unicomb / delay / vibrato

1) unicomb filter 'all pass' mode implemented
  • calculations too long; why?
  • spectrogram plot: search/make something decent
  • watch out with commands wavread and wavwrite --> you have to give the right sampling frequency, otherwise your sound is totally distorted, while it's not the fault of your algorithm
  • we worked on a vibrato, nut didn't succeed; maybe just because previous point :( --> we should try it again !!
2) to implement delay mode
  • problems with calculation time... not able to use it because of duration/ why?
  • a delay with parameters M=20000 so approx. 0.5 seconds results in no sound when using La.wav (it's own duration is also 0.5sec) This is because everything falls outside the window then. But do you have to sum the original signal with this one? Because it seems that this is already done in the figure (DAFX ch3). --> We must sum, it is NOT done in the picture
3) vibrato
  • same problem: too long ! frustrating !
  • ok, it was because of the parameterchoice... shorter, but still not short enough.

05 December 2006

spectrogram function by Evan Ruzanski

function spectrogram(x,fs,bw)
% Plot grayscale spectrogram with variable sampling rate and bandwidth
% Evan Ruzanski, ECE259, 2/26/2003

% Set minimum FFT length
fftmin = 256;

% Set 2*fs/bw variable window length for good resolution with long block length
% will provide higher frequency resolution as main-lobe of the window
% function will be narrow and short block length will provide higher time
% resolution as less averaging across samples is performed for each
% STFT value
winlen = floor(2*fs/bw);

% Get FFT length
fftlen = max([winlen fftmin]);

% Create window (Hamming for favorable sidelobe attenuation) and zero pad accordingly
win = [hamming(winlen) ; zeros(fftlen-winlen,1)];
win = win/sum(win); %normalizing?
windel = (0:(length(win)-1)) * win;
??

hmm from hereon I loose it... someone?
% Set overlap (Effects expansion of spectrogram display as no overlap plots
% fftlen data points and maximum overlap gives 1 data point) ?? This I don't understand?
ntime = 500; % Choose based on trial-and-error for best looking plot
overlap = floor(max(fftlen/2, (ntime*fftlen-length(x))/(ntime-1)));
ntime = floor((length(x)-overlap)/(fftlen-overlap));

% Create arrays
c1=(1:fftlen)';
r1=(0:ntime-1)*(fftlen-overlap);

% Take FFT of real data
b = fft(x(c1(:,ones(1,ntime))+r1(ones(fftlen,1),:)).*win(:,ones(1,ntime)));
if size(b,1) == 1
m = length(b1);
b(floor((m+4)/2):m) = [];
else
m=size(b,1);
b(floor((m+4)/2):m,:) = [];
end
b = b.*conj(b);

% Setup pixel locations for plot on frequency axis and time axis
f=(0:fftlen/2)*fs/fftlen; % Point spacing
t = (r1+windel)/fs;

% Set limit for dB scale
lim = max(b(:))*0.0001;

% Set dB scale
b=2.5*log10(max(b,lim));

% Plot
imh = imagesc(t,f/1000,b);

% Set up axis and labels for plot
axis('xy');
title('Spectrogram');
xlabel('Time (s)');
ylabel('Frequency (kHz)');

% Set and apply grayscale levels
colormap(gray);
map2 = colormap;
map2 = 1 - map2;
colormap(map2);
colorbar; % Applies color legend to plot
orient landscape;

matlab session on my own

To login to ESAT from a remote computer I use Filezilla:
host: login.esat.kuleuven.be
port: 22
servertype: SFTP using SSH2
user: rvandenb
Here I hold the folder 'Thesis', where all the documents are placed.

To give labels to axes:
xlabel('Time');
ylabel('Frequency (Hz)');
I tried understanding the command spectrogram, but...
Even the chirp example given by matlab I don't understand. I now try searching google on it.

Found an at first sight, very exciting URL: http://www.nd.edu/~nkottens/
Some other m-files about audio on the mathworks file-exchange also with some effects!

There is a spectrogram function on this site with better visual effect.
I'll study this one in another post

weekly update and some matlab 5nov2006

Just before the first weekly update since like a month, I worked a little bit on matlab.
I tried making a simple m-file which does all kinds of standard stuff for audio processing. Load a file, play a file, view the plot and spectrogram.
Some points of attention:
  • subplot(numberOfRows, numberOfColumns, indexOfPlot)
    after that, comma-separated, you specify the plot (plot(), stem(),...)
    and then several things you like, like: title(''), ...
    indexOfPlot counts from left-to-right and top-to-bottom.
  • wavread:
    [soundFile,sampleFrequency] = wavread('La.wav');
    soundFile is a vector with all the samples (values between -1 and 1). If you want to see only the first n bits, you add n after the filename at the RHS of the equation. To see the vector you omit the point comma.
    sampleFrequency in Hertz
    You also can specify the number of bits used to encode every sample (Quantisation), therefore you add 'NBits' after 'sampleFrequency'
  • Supports multi-channel data, with up to 32 bits per sample.
  • This file reader only supports Microsoft PCM data format.
  • Also auread is possible
  • To control the axes of the plot a website previously specified uses: axis('tight')
    This sets the limit of the axes to the range of data. I think that the y-axis is too tight then, so I put it on auto (still quite tight though). What I search for is a command to set only the y-axis (to -1, 1)
  • I think it's best to always add 'clear all;' in the beginning of an m-file.
  • The website still used specgram, but matlab prefers spectrogram now.
    I will study this soon.
Then the update with toon and tony.
We really should start implementing some effects in matlab. When encountering problems we don't have to bother disturbing toon.
We talked a bit about simulink/matlab/... According to toon best is to implement in matlab. He thinks there's no way to implement realtime functions in matlab (although he heard that in a newer version of matlab there'd be a possibility to do exactly so). The normal procedure would be to translate the matlab code to C or JAVA.
More wasn't really said I think. Before the exams we must really try to implement some effects and get used to the matlab environment.
We also can start searching for articles on audio feature extraction.
That's all.

01 December 2006

Problems and questions encountered when working with matlab

I'm doing a matlab session today and it clearly needs a warm-up.
In almost every m-file I try from the book DAFX, I either don't fully understand it, either it gives errors, either I don't see how it can be used with an audio fragment.
A short google search on "matlab commands functions for digital audio" gives me a few interesting looking URL's:
http://ccrma.stanford.edu/~jos/filters/Matlab_Utilities.html
and further on "audio in matlab":
matlab audio processing examples
http://vise.www.ecn.purdue.edu/VISE/ee438L/matlab/help/pdf/audio.pdf
audio processing


Some questions I have:
  • How do you allow figures to be in one window? And with a title...
  • What must you do when a function used in an example m-file of the book DAFX isn't recognized by matlab? (like CreateNewTrack)