28 February 2007

Reading at home and matlab session at ESAT on HPS and an interface between effects

I finally found an article on the specific thing we are working on (well, some articles actually):
Very interesting of course.
In the first article they speak of mapping functions and a control curve. The control scaling part is also nice.

I'm going to read the other ones now too, because I tried some other things in matlab that wouldn't work well. A little more explanation:
  1. I implemented the basics of the HPS algorithm as promised yesterday. Basically it's all right, but I can't tell yet whether it works: firstly, I haven't implemented the deciding code yet (which decides which peak is the current pitch); secondly, I haven't thought about frames or blocks yet (maybe this is not necessary, but I haven't considered it); thirdly, a seemingly stupid thing happens in matlab: it doesn't recognize the file I want to play. (Solved: I have to pass the arguments as strings, so enclosed in ' '.) A minimal sketch of what I have in mind so far follows below this list.
  2. I also tried to make an applyEffect.m-like file that selects the extraction algorithm and the effect, then reads the file, applies the effect with the parameters from the extraction algorithm, and plays the file. This is more complex than I thought, so I need to read or think about the control curve some more.
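For reference, a minimal sketch of the HPS idea as I understand it so far, including a naive version of the deciding code (function and variable names are mine; one block of a mono signal, R harmonics, no framing yet):
function f0 = hps_pitch(x, Fs, R)
% harmonic product spectrum on one block of a mono signal x
N = length(x);
X = abs(fft(x(:).*hanning(N)));  % magnitude spectrum of the windowed block
X = X(1:floor(N/2));             % keep positive frequencies only
P = X(1:floor(length(X)/R));     % truncate so R downsampled copies fit
for r = 2:R
    Xd = X(1:r:end);             % spectrum downsampled by factor r
    P = P .* Xd(1:length(P));    % harmonics of f0 line up and multiply
end
[Pmax, k] = max(P);              % deciding code: take the strongest product peak
f0 = (k-1)*Fs/N;                 % convert the FFT bin to Hz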

The official title of our thesis

Our department asked us to give them the official title of our work. They need it for the official webpages about theses on the KUL website.
Little brainstorm:
  • It's live music we're dealing with. Live as in real-time, and also as in 'with an audience', but the latter doesn't change anything for us.
  • It's music we're dealing with. If it's not necessary to specify which kind of music, I wouldn't do it. But it might be appropriate to make a coarse distinction anyway. Something like: electro-acoustic music, or modern music (although this does not necessarily exclude classical music). Maybe this isn't even necessary, because the fact that you use audio effects already implies that you won't be doing classical music. Whoops, not true: some effects might be useful for classical live concerts too. What to do with this item?
  • Digital effects of course
  • Intelligent is a better name than adaptive if you ask me, although adaptive is already used in the literature [verfaille, arfib]
Extensively comment on this post please!

27 February 2007

matlab session on pitch tracking using harmonic product spectrum


I'm hoping to implement a working matlab algorithm based on the HPS method.
Some information on it:

Much time 'lost' on the weekly meeting.
I started implementing HPS, but I'll continue tomorrow; nothing worth mentioning for now.
I also tried the frame-by-frame idea. I still have to study it, but I made an implementation in the thesis folder on ESAT.




Between the soup and the potatoes (a Dutch saying for 'in between other things') I added a mixer section to the applyEffect algorithm:
%% mixer
inputSignalMixer = outputSignalEffect(1:length(originalSignal)); % cut any extra samples away for the wet and dry signals to match
outputSignalMixer = wet * inputSignalMixer + (1 - wet) * originalSignal; % mix wet and dry signal

weekly meeting

A very long meeting; many things were said. I'll attempt a conclusion:
In this stage we see too many possibilities, too many roads we could take. That's normal.
  • A good idea is to keep on swimming in this chaos for a few days and then try to decide on some things and to order everything in our minds.
  • We discussed real-time issues and complexity issues for a while:
    Following Toon's opinion there are two extremes for us. I made a little figure of them:

    The first one is the most difficult one: extracting information from the very signal upon which we want to apply the effect.

    The second one is more complex, but easier. Here we have the whole spectrum of chosen tracks we want to use to extract different features from the music. It is best to use them all at once as input; the algorithm can then choose the tracks it needs for the solution. This feature vector is then used as input for effects that are applied on different tracks of choice (not necessarily the same tracks that are used as input for the feature extraction).

    -28feb2007-
    There is another well-defined extreme I thought of: it's like the second idea, but using only the master tracks as input for the extraction algorithms. This means you have your 'intelligent effect box' and you plug into it a stereo mic or the stereo master outputs of the mixer. The algorithms then have to segment the music, or at least do their job without specific knowledge about the input. They must be able to deal with noise from the audience. This is a problem we won't handle.
  • I also talked a lot about the interface between the effect blocks and the extraction blocks. Probably the best and only choice is to use a modular structure. But in that case I want a standard interface as soon as possible (a first sketch follows after this list).
  • We will never reach the stage where we implement the effects in C or anything like that; the farthest we'll get might be simulink (modular and real-time). It might even be the right time to switch to simulink already.
  • Obviously I forgot a lot of details here... someone?
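To make the standard-interface idea a bit more concrete, here is a minimal sketch of the convention I have in mind (every function name is hypothetical, nothing is decided yet): extraction blocks map a frame of audio to features, a mapping block turns features into effect parameters (the control curve), and effect blocks take audio plus those parameters.
% hypothetical building blocks:
%   features = extractFeature(frame, Fs)       % extraction block
%   params   = mapFeatureToParams(features)    % mapping block / control curve
%   yFrame   = applyEffect(frame, Fs, params)  % effect block
% a frame-by-frame loop could then connect any extractor to any effect:
for b = 1:blocks
    frame = X((b-1)*N + (1:N));                % non-overlapping frames of length N
    features = extractFeature(frame, Fs);      % e.g. pitch, energy, tempo
    params = mapFeatureToParams(features);     % control scaling happens here
    Y((b-1)*N + (1:N)) = applyEffect(frame, Fs, params);
end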

26 February 2007

time based pitch tracking in matlab


Today I'll try a basic algorithm based on long-term prediction (LTP). See DAFX p348 and further. It is optimized for a singing voice; this optimization is done in the post-processing part. I have to change the main algorithm the same way I did with the frequency-based algorithm of the previous post.
Some calculations were needed to make sure the matrix dimensions in the for-loop are not exceeded. The loop reads x=X((b-1)*K+(1:N+pre)), so the largest index used is
indexMax = ((blocks-1)*K) + N + pre
Since blocks = floor(Nx/K), we have ((blocks-1)*K) < Nx, and with n0 = pre+1 the signal length is Nx = n1 - n0 + 1 = n1 - pre (where pre = lmax). Therefore
indexMax < (n1 - pre) + N + pre = n1 + N
So we need to pad the X-vector with N extra zeros:
X = [X,zeros(1,N)]; % zero-padding until new length
The algorithm, with indicated external functions and default values:
% M-file 9.23
% pitch_tracker_ltp.m
%
% main file for a pitch tracker based on long-term prediction in time domain
% (c) 2002 Florian Keiler

fname='x1.wav';
%n0=2000; %start index
%n1=210000;
[X,Fs]=wavread(fname);
n0=1;
n1=length(X);

K=200; % hop size for time resolution of pitch estimation
N=1024; %block length

% checked pitch range in Hz:
fmin=50;
fmax=800;
b0_thres=.2; % threshold for LTP coeff
p_fac_thres=1.05; % threshold for voiced detection
% deviation of pitch from mean value

%[xin,Fs]=wavread(fname,[n0 n0]); %get Fs
%lag range in samples:
lmin=floor(Fs/fmax);
lmax=ceil(Fs/fmin);
pre=lmax; %number of pre-samples
if n0-pre<1
    n0=pre+1;
end
Nx=n1-n0+1; %signal length
blocks=floor(Nx/K);
% blocks=floor(length(X)/K);%number of blocks
Nx=(blocks-1)*K+N;
%[X,Fs]=wavread(fname,[n0-pre n0+Nx]);
X=X(:,1)';
X = [X,zeros(1,N)]; % zero-padding until new length


pitches=zeros(1,blocks);
for b=1:blocks
    x=X((b-1)*K+(1:N+pre));
    [M, Fp]=find_pitch_ltp(x, lmin, lmax, N, Fs, b0_thres);
    if ~isempty(M)
        pitches(b)=Fs/M(1); % take candidate with lowest pitch
    else
        pitches(b)=0;
    end
end

%%%% post-processing:
L=9; % number of blocks for mean calculation
if mod(L,2)==0 % L is even
    L=L+1;
end
D=(L-1)/2; % delay
h=ones(1,L)./L; % impulse response for mean calculation
% mirror beginning and end for "non-causal" filtering:
p=[pitches(D+1:-1:2), pitches, pitches(blocks-1:-1:blocks-D)]; % 2D samples longer
y=conv(p,h); % length: blocks+2D+2D
pm=y((1:blocks)+2*D); % cut result

Fac=zeros(1,blocks);
idx=find(pm~=0); % don't divide by zero
Fac(idx)=pitches(idx)./pm(idx);
ii=find(Fac<1 & Fac~=0);
Fac(ii)=1./Fac(ii); % all non-zero elements are now > 1
% voiced/unvoiced detection:
voiced=Fac~=0 & Fac<p_fac_thres;

T=40; % time in ms for segment lengths
M=round(T/1000*Fs/K); % min. number of blocks in a row
[V,p2]=segmentation(voiced, M, pitches);
p2=V.*p2; % set pitches to zero for unvoiced

%plotting things
figure(1)
clf
time=(0:blocks-1)*K+1; %start sample of blocks
time=time/Fs; %time in seconds
t=(0:length(X)-1)/Fs; % time in sec for original
subplot(211)
plot(t, X)
title('original x(n)')
axis([0 max([t,time]) -1.1*max(abs(X)) 1.1*max(abs(X))])

subplot(212)
idx=find(p2~=0);
plot_split(idx,time, p2)
title('pitch in Hz')
xlabel('time/s \rightarrow')
axis([0 max([t,time]) .9*min(p2(idx)) 1.1*max(p2(idx))])


% synthesize sinusoids:
y2=synth_sine_fade(p2,K,Fs);
soundsc(y2, Fs)
wavwrite(.99*y2,Fs,[fname, '_pitch_ltp'])


This is the result with default arguments:
    • n0 = 1
    • n1 = length(X)
    • K = 200
    • N = 1024
    • fmin = 50
    • fmax = 800
    • b0_thres = .2
    • p_fac_thres = 1.05

The parameter K has the same effect as in the previous post.
Since the threshold value was already very low, I assumed the algorithm would also be able to track x2.wav (a musical piece with a strong melody), as in the previous post. But this is not the case here. Whether this is inherent to the algorithm itself or due to the post-processing part I cannot tell. From a quick look at the code it seems that the post-processing part is exactly the same for both algorithms tested so far, which suggests that the algorithm itself is not fit for tracking musical melodies.
The threshold value apparently has another effect too, because when I enlarge it, the results for x2.wav are somewhat better. I might check the exact meaning of the threshold value, but first I'll try some other algorithms, because it seems that this one is inherently unsuited for our purpose.

23 February 2007

matlab session on pitch tracking ctd.


I started analyzing the FFT based pitch tracking algorithm described in DAFX pages 337 and further.
First I did the find_pitch_fft.m file:
function [FFTidx, Fp_est, Fp_corr] = find_pitch_fft(x, win, Nfft, Fs, R, fmin, fmax, thres)

%[x,Fs] = wavread(soundFile);
%x = x(1:Nfft+R);
% [FFTidx, Fp_est, Fp_corr] = find_pitch_fft(x, win, Nfft, Fs, R, fmin, fmax, thres)
%
% M-file 9.15
% find_pitch_fft.m
%
% find pitch candidates of given signal block x
% x: input signal of length Nfft+R
% win: window for the FFT
% Nfft: FFT length
% Fs: sampling frequency
% R: FFT hop size
% fmin, fmax: minimum/maximum pitch freqs to be detected
% thres: omit maxima more than thres dB below the main peak
%
% (c) 2002 Florian Keiler

FFTidx = []; % FFT indices
Fp_est = []; % FFT bin frequencies
Fp_corr = []; % corrected frequencies
dt = R/Fs; % time diff between FFTs
df = Fs/Nfft; % freq resolution
kp_min = round(fmin/df);
kp_max = round(fmax/df);
x1 = x(1:Nfft); % 1st block
x2 = x((1:Nfft)+R); % 2nd block with hop size R
[X1, Phi1] = fftdb(x1.*win,Nfft);
[X2, Phi2] = fftdb(x2.*win,Nfft);
X1 = X1(1:kp_max+1);
Phi1 = Phi1(1:kp_max+1);
X2 = X2(1:kp_max+1);
Phi2 = Phi2(1:kp_max+1);
idx = find_loc_max(X1);
Max = max(X1(idx));
ii = find(X1(idx)-Max>-thres);

%----- omit maxima more than thres dB below the main peak -----
idx = idx(ii);
Nidx = length(idx); % number of detected maxima
maxerr = R/Nfft; % max phase diff error/pi (pitch max. 0.5 bins wrong)
maxerr = maxerr*1.2; % some tolerance
for ii=1:Nidx
    k = idx(ii) - 1; % FFT bin with maximum
    phi1 = Phi1(k+1); % phase of x1 in [-pi,pi]
    phi2_t = phi1 + 2*pi/Nfft*k*R; % expected target phase after hop size R
    phi2 = Phi2(k+1); % phase of x2 in [-pi,pi]
    phi2_err = princarg(phi2-phi2_t);
    phi2_unwrap = phi2_t+phi2_err;
    dphi = phi2_unwrap - phi1; % phase diff
    if (k>kp_min) & (abs(phi2_err)/pi<maxerr)
        Fp_corr = [Fp_corr; dphi/(2*pi*dt)];
        FFTidx = [FFTidx; k];
        Fp_est = [Fp_est; k*df];
    end
end

Problems I encountered:
  • I erased Fs as an input argument, because I can get it directly from the wavread command. This posed no problems until I used another function that calls this one. For both functions to use the same Fs, it is necessary to pass it as an input argument after all.
  • When passing the argument win I must write it like this:
    find_pitch_fft('La.wav',hann(1024),1024,1,20,15000,40)

After that I went on to the Pitch_Tracker_FFT_main.m file.

In the main function I encountered some more problems:
  • I had to correct the first line like this:
    I had to add the apostrophes and the .wav extension
    fname='x1.wav';
  • Next, they had optimized the algorithm for a specific wav file. To generalize this I had to adjust the n0 and n1 boundaries. I reasoned about the most general solution: set n0 to the first sample and n1 to the number of samples the given piece contains.
    %these are optimized for x1.wav
    %n0=2000; %start index
    %n1=210000;
    %I'll try it in general
    [X,Fs]=wavread(fname);
    n0=1
    n1=length(X)
    This gave some problems where they make some changes to n0 and n1, because suddenly the new n1 would be bigger than the length of the piece. I solved this by simply eliminating that part. The algorithm still works then, so I wonder why it was in the algorithm in the first place. (OK)
    %Nx=n1-n0+1+R %signal length
    %blocks=floor(Nx/K)
    %Nx=(blocks-1)*K+Nfft+R
    %n1=n0+Nx % new end index
  • But this gives a 'matrix dimensions exceeded' problem:
    for b=1:blocks
    x=X((b-1)*K+1+(1:Nfft+R));
    Strange that matlab doesn't seem to complain about the fact that blocks isn't defined.
  • These and other adjustments kept creating new problems, so I tried it differently:
    I followed the original algorithm step by step to see what it does.

    Nx = n1 - n0 + 1 + R
    %so it actually makes the chosen part longer
    %in my case, as I want n0 to be 1 and n1 to be the length of the audio sample, it makes Nx greater than the total length of the wave file.
    blocks = floor(Nx/K)
    %As far as I understand, this divides the total number of Nx samples into Nx/K non-overlapping blocks of length K (floored to the nearest smaller integer).
    %so this must be something else than the blocks for the FFT, which overlap over their length minus R.
    %anyone an idea what they're here for then? The algorithm talks about hop size for the time resolution of the pitch estimation (see below for an answer (remarks after the algorithm))
    Nx = (blocks-1)*K + Nfft + R
       = (floor(Nx/K)-1)*K + Nfft + R
       < ((Nx/K+1)-1)*K + Nfft + R
       = Nx + Nfft + R
       = n1 - n0 + 2*R + 1 + Nfft
    %the new end index becomes
    n1 = n0 + Nx
       < n0 + (n1 - n0 + 2*R + 1 + Nfft)
       = n1 + 2*R + 1 + Nfft
    So 2*R + 1 + Nfft is the maximum number of samples added to the original part of the audio sample; in my case this is the maximum number of samples added beyond the total audio sample.
    This now gives us a choice:
    • Either we do the wavread here. We must then make sure that the original n1 is smaller than the total length of the audio sample minus 2*R + 1 + Nfft, so that we will certainly not try to 'wavread' samples that aren't there.
    • Or we zero-pad the wave vector here until it has the new length n1-n0.
      This is what I'll do here. It might give a small error for the last blocks, which contain these zeros, but there will always be problems at the end of a sample. I think the zeros minimize those problems. (I could be wrong.)
This brings us to the first part (that works generally):
% M-file 9.18
% pitch_tracker_fft_main.m
%
% main file for a pitch tracker based on FFT with phase vocoder approach
% (c) 2002 Florian Keiler

fname='La.wav';
[X,Fs]=wavread(fname);
n0=1
n1=length(X)

Nfft=1024;
R=1; % FFT hop size for pitch estimation
K=200; % hop size for time resolution of pitch estimation
thres=50; % threshold for FFT maxima
% checked pitch range in Hz:
fmin=50;
fmax=800;
p_fac_thres=1.05; % threshold for voiced detection
% deviation of pitch from mean value

win=hanning(Nfft)'; % window for FFT

Nx=n1-n0+1+R %signal length
blocks=floor(Nx/K)
Nx=(blocks-1)*K+Nfft+R
n1=n0+Nx % new end index
X=X(:,1)';
length_afterCalculatingNewLength = length(X)
% original zero-padding loop: for i=length(X):n1, X(i)=0; end
% changed to vectorized zero-padding up to the new length:
X = [X,zeros(1,n1-length(X))];
new_length = length(X)

pitches=zeros(1,blocks);
for b=1:blocks
    x=X((b-1)*K+1+(1:Nfft+R));
    [FFTidx, Fp_est, Fp_corr]=find_pitch_fft(x, win, Nfft, Fs, R, fmin, fmax, thres);
    if ~isempty(Fp_corr)
        pitches(b)=Fp_corr(1); % take candidate with lowest pitch
    else
        pitches(b)=0;
    end
end
Remarks on the parameters:
  • If I enlarge K (1000) the algorithm works much faster, but, as you see, the time resolution is not so good.
  • If I make K smaller, the calculations take very long and the pitch tracking is much more fluent. Sometimes it has short faulty tones. Maybe we can filter these out, assuming that the voice cannot change that fast and that briefly (see the sketch after this list). But this would stop us from using the algorithm for instruments...

    K thus sets the number of times the main for-loop is run.
  • A very big (1000) value for R results in long calculations and, technically speaking, a worse pitch estimation, see fig 9.27, p337 in DAFX.
  • For input x2.wav to give a meaningful result I set thres to 1 (which is very small compared to the preferred 30-50). This means we keep the allowed deviation from zero for the phase error very small. I don't really understand why that gives better results here. If I leave thres at its default value, we get nothing but two low blobs of sound somewhere in time.
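About those short faulty tones: a minimal sketch of the filtering idea, assuming the pitch track is in the vector pitches from the loop above (the window length is a guess of mine). A median filter removes isolated outliers while keeping real note changes:
L = 5;                                        % odd number of blocks in the window
D = (L-1)/2;
pp = [pitches(1)*ones(1,D), pitches, pitches(end)*ones(1,D)]; % pad the edges
pitchesFiltered = zeros(size(pitches));
for b = 1:length(pitches)
    pitchesFiltered(b) = median(pp(b:b+L-1)); % median over the window
end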
Next time I'll try the time-domain based pitch tracking algorithm from DAFX. For now I'm stopping here.

22 February 2007

matlab session on pitch detection, but more about ring modulator etc.

Also today I do a little warming up: I'd like to give it a shot and implement a ring modulator (DAFX p76). The ring modulator of my Yamaha MO8 has the following parameters:
    • oscillator frequency coarse (0.5 - 5 kHz)
    • oscillator frequency fine (0 - 127)
    • LFO wave (tri, sine)
    • LFO depth (0 - 127)
    • LFO speed (0.0 - 39.70 Hz)
    • HPF cutoff frequency
    • LPF cutoff frequency
    • dry/wet balance
    • EQ low frequency
    • EQ low gain
    • EQ high frequency
    • EQ high gain
Only the first five comprise the ring modulation; 6 and 7 are filter parameters, 8 is a mixer parameter and the rest are EQ parameters.
Some remarks:
  1. I used
    modSound = sin(2*pi*OSCfreq*[0:1/Fs:Nbits]);
    but Nbits indicates the quantization depth (16 bits for the wave file flute2.wav in my folder). Instead I have to use the length of the sound vector (as seen in the algorithm) when I want the modulating signal to be as long as the soundFile input.
    What I didn't get at first is why I have to divide the length of the vector sound by Fs. It is a kind of normalization, and if I don't do it, it doesn't work.
    solution: Try it with a ridiculous 2 Hz sampling frequency and a 10-second music sample. You'll see that if you want as many samples in the sine wave as in the input sample, you have to divide the length of your input minus 1 by Fs to get to the 10 seconds (see the check below).
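    The same check in matlab, with the toy numbers from that 2 Hz example (all values here are my own):
    Fs = 2;               % ridiculously low sampling frequency
    L = 21;               % a '10-second' sample: 21 samples at 2 Hz
    t = 0:1/Fs:(L-1)/Fs;  % time axis in seconds
    % length(t) is 21, exactly as many samples as the input,
    % and t(end) is (L-1)/Fs = 10 seconds: hence the division by Fs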
  2. I spent a reasonable amount of time in generating different signals to be used as modulators; sine wave, triangular wave (special case of:), sawtooth, square wave. Everything works allright and the explanations are to be found in the m-file.
  3. The algorithm:
    function [output] = ringmod(soundFile, OSCfreq, LFOwave, LFOdepth, LFOspeed)
    % ring modulator
    % y(n) = x(n).m(n)
    % OSCfreq: oscillator frequency (best between )
    % LFOwave: type of wave that modulates the audio signal (sine, triangle, square, sawtooth with standard width)

    [sound,Fs,Nbits] = wavread(soundFile);

    % defining the modulating wave type 'LFOwave'
    % SINE
    if (strcmp(LFOwave, 'sine')) % use strcmp instead of == to compare strings
        % sampled in steps of Ts = 1/Fs up to (length(sound)-1)/Fs seconds,
        % so the modulator has exactly as many samples as the input
        % (the division by Fs is explained in remark 1 above)
        modSound = sin(2*pi*OSCfreq*[0:1/Fs:(length(sound)-1)/Fs]);
    end
    % SAWTOOTH
    if (strcmp(LFOwave, 'sawtooth'))
        width = 0.9; % this defines where the max of the wave is situated in the interval between 0 and 2*pi
        modSound = sawtooth(2*pi*OSCfreq*[0:1/Fs:(length(sound)-1)/Fs],width);
    end
    % TRIANGLE
    if (strcmp(LFOwave, 'triangle'))
        % triangle wave (width = 0.5)
        modSound = sawtooth(2*pi*OSCfreq*[0:1/Fs:(length(sound)-1)/Fs],0.5);
    end
    % SQUARE
    if (strcmp(LFOwave, 'square'))
        modSound = square(2*pi*OSCfreq*[0:1/Fs:(length(sound)-1)/Fs]);
    end

    %wavplay(modSound, Fs); %for testing
    %sound(1:100)
    %modSound(1:100)
    %length(sound)
    %length(modSound)
    output = sound .* modSound'; % elementwise product of input and modulator

    wavplay(output, Fs);
  4. This effect is perceived as nice e.g. when using a voice as input, a sine or other wave as modulation, and low frequencies (~50 Hz).
  5. I had some problems with the multiplication, but after trying it with small matrices I saw why:
    • I had to use the elementwise multiplication operator .*
    • I had to have two column vectors. The modSound waves are, on the contrary, row vectors, so I had to transpose them: modSound'
    Note that LFOdepth and LFOspeed are not used in the function above yet; a sketch of how they could work follows below.
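A minimal sketch of how LFOdepth and LFOspeed could be wired in, reusing the variables of ringmod above. I'm assuming the LFO modulates the oscillator frequency and that depth 0-127 maps to a relative frequency sweep; this is my guess at what the MO8 does, not a verified implementation:
t = 0:1/Fs:(length(sound)-1)/Fs;              % same time axis as above
lfo = sin(2*pi*LFOspeed*t);                   % the LFO itself (tri would also work)
sweep = (LFOdepth/127)*OSCfreq;               % depth as a fraction of the carrier
phase = 2*pi*cumsum(OSCfreq + sweep*lfo)/Fs;  % integrate the instantaneous frequency
modSound = sin(phase);                        % frequency-modulated carrier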

OK, a three-hour warm-up, typical...
So now a little bite and then: pitch extraction!

Hm, another two hours spent surfing the internet and finding sites we could use. Look at the posts involved: Links, just links; some links for beat extraction.
Also I started a post that will contain ideas we have during our work. This way we have a leading trail when experimenting later on.
OK, another try towards pitch detection.

I'll talk about pages 336 and further in DAFX.
Pitch extraction amounts to estimating the fundamental frequency f0, possibly followed by post-processing such as pitch tracking and taking frequency relationships into account.
I'll be reading the interesting things and tomorrow I'll try the matlab code.

Links, just links

Ideas, just ideas

  1. detect when a song is being played or not and use this information to turn the reverb on the singer's voice on and off.
    At first I thought of using the singer's mic as an input, but that's not enough. Practically, the algorithm must know the difference between someone talking and someone singing. Is it enough to do this by detecting when music is playing? No, because when singing (parts) a cappella the reverb must remain switched on!
    see http://ieeexplore.ieee.org/iel5/9248/29346/01326825.pdf
  2. detect the tempo of a song and use it as the delay time parameter of a delay effect (a sketch of this mapping follows after this list).
  3. detect the pitch/tone at which a singer is singing and (when switched on) automatically pull the pitch to the closest existing note. Practically this might only be useful for specific notes, those that are difficult for the singer to reach. For the rest it must be turned off, because the value of a singer as an instrumentalist is that he can form notes between the existing ones, not to forget glissandos.
    Might already exist: autotune (DAFX p336)
  4. another one that is also mentioned in DAFX p336: detect the amplitude and use it as a parameter for compression
  5. detect non-silence and only let the compressor with its high output gain work then. This way, when there is silence, the noise will not get such a high gain and disturb the sound.
  6. maybe a compressor that only compresses the track in the frequency bands (or even as detailed as the pitch itself) in which it is currently playing. Wait, I thought of this as a means to not make the noise problems worse in silent passages, but that doesn't change here. Better idea: the same idea but with a noise gate.
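For idea 2 the mapping itself is simple; a sketch, assuming some beat detection algorithm has already produced a tempo estimate (bpm and Fs are made-up values here):
bpm = 120;                             % assumed output of a beat detection algorithm
Fs = 44100;                            % sampling frequency of the track
delaySeconds = 60/bpm;                 % one echo per beat (30/bpm would give eighths)
delaySamples = round(delaySeconds*Fs); % delay line length for the delay effect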

some links for beat extraction

http://www.fxpal.com/publications/FXPAL-PR-01-152.pdf
http://www.fxpal.com/publications/FXPAL-PR-01-022.pdf
http://www.csl.sony.fr/downloads/papers/2002/ZilsMusic.pdf automatic extraction of drum signals from polyphonic music signals

In a little chat with Toon we talked about the problems so far with beat detection.
A possible part of the solution is to assign weight factors to the frequency bands, so that the most important band has the biggest weight. The question is how to calculate these factors. The best idea is to have them calculated automatically, by means of correlation values or something similar for every band (a first sketch follows below).
We also talked about bigger time windows, as in: why don't we look for repeating riffs or measures? This might work better with more difficult pieces of music. This brought us to googling "measure detection" and other things. But this is a TODO.
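A first sketch of that automatic weighting, under my own assumptions: the band envelopes are already available in a matrix env (nBands x nFrames, hypothetical), and a band's weight is the height of its strongest normalized autocorrelation peak, i.e. how periodic that band is:
nBands = size(env,1);
w = zeros(nBands,1);
for i = 1:nBands
    e = env(i,:) - mean(env(i,:)); % remove the DC offset so lag 0 doesn't dominate
    r = xcorr(e,'coeff');          % normalized autocorrelation
    r = r(length(e)+1:end);        % keep strictly positive lags
    w(i) = max(r);                 % periodicity strength of this band
end
w = w/sum(w);                      % make the weights sum to one
% in practice the lags should be restricted to plausible beat periods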

measure extraction music
This particular search term turns up loads of interesting websites. They are collected in the post: Links, just links.

another idea:
why not use ICA, but for audio, to extract independent components like a drum line?

21 February 2007

matlab session on pitch extraction

+tony, ronny+

  1. I wanted to try another effect before starting on pitch detection algorithms. The one I chose was a guitar distortion effect.
    I stumbled upon a very good link: www.musicdsp.org
    There you can find an archive on analysis, effects and more. It seems that most of the coding there is done in C or C#. The distortion effect I tried here is translated to matlab code.

function [x]=gdist(a,x)
%[Y] = GDIST(A, X) Guitar Distortion
%
% GDIST creates a distortion effect like that of
% an overdriven guitar amplifier. This is a Matlab
% implementation of an algorithm that was found on
% www.musicdsp.org.
%
% A = The amount of distortion. A
% should be chosen so that -1<A<1.
% X = Input. Should be a column vector
% between -1 and 1.
%
%coded by: Steve McGovern, date: 09.29.04
%URL: http://www.steve-m.us
[x,fs] = wavread(x); % x is passed as a filename here, then overwritten with the samples
k = 2*a/(1-a);
x = (1+k)*(x)./(1+k*abs(x));
wavplay(x,fs);
I also tried another one:
function [output]=fuzzy(sound, amount)
[sound,fs] = wavread(sound); % 'sound' is passed as a filename, then overwritten
norms = norm(sound);
output = sound/norms; % normalize the input
amount = (1- amount)/100;
for i = 1:length(output)
    % hard-clip the normalized signal at +/- amount
    if ( output(i) > amount )
        output(i) = amount;
    end
    if ( output(i) < -amount)
        output(i) = -amount;
    end
end
output = output*norms; % restore the original level
wavplay(output,fs);
Both algorithms are to be found in my folder on ESAT (gdist.m and dist.m)

  1. see above
  2. I tried to get hold of a book that was referenced in an article I read (sorry, forgot which one). The book, by Roads, is called 'The Computer Music Tutorial'. It became a real quest to find it; I haven't succeeded yet. I'll keep trying.
  3. Order of the day: I'll experiment with pitch extraction, see the rest of this post.


matlab session on beat detection

today: beat detection algorithm

some interesting links:
http://www.owlnet.rice.edu/~elec301/Projects01/beat_sync/beatalgo.html
http://homes.esat.kuleuven.be/~tvanwate/alari/projects/beat_detection/scheirer98.pdf

explanation of some terms:
- wave rectifiers
http://en.wikipedia.org/wiki/Rectifier

filterbank.m:
- bl(i) = floor(bandlimits(i)/maxfreq*n/2)+1;
here bandlimits(i)/maxfreq is the scale factor for the samples of your FT (a numeric check follows after these notes)
- check again what exactly a DFT does:
it maps n samples to n samples (in what way do those last
samples then represent the frequency domain?? ---> evening reading)
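A numeric check of that line with toy values of my own: bandlimits is in Hz, maxfreq is the Nyquist frequency and n is the FFT length, so the expression converts a band edge in Hz to an FFT bin index.
n = 1024;                               % FFT length
maxfreq = 44100/2;                      % Nyquist frequency for Fs = 44100
bandlimits = [0 200 400 800 1600 3200]; % band edges in Hz (toy values)
bl = floor(bandlimits/maxfreq*n/2)+1    % gives [1 5 10 19 38 75]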

the other m-files I simply executed:

the 4 steps that are applied are clear,
but the algorithm needs to be worked through in more depth...

tested it on a hip-hop record, nothing really decent came out
next time, test it with several house records
the beat should then be easier to detect

conclusion: it does seem to work
haven't fully worked through the algorithm yet --> study it again
next step: test whether it captures relative differences well

20 February 2007

matlab session on feature extraction

A short list of several feature extraction methods for audio:

Weekly meeting

We agreed on many things: after experimenting with effects in matlab, we should now start working in matlab on the extraction of features from audio. That is what we're going to do first. The next step would then be to connect the extracted features to effects. All of this might be done in matlab first.
Of course we really want this to run in real time (not only for easy testing, but also for client purposes) and to be modular.
If we could make standard effect blocks and standard extraction blocks in simulink, that would be nice. Simulink might indeed be the best next step after matlab, e.g. to test in real time (although this might require some reprogramming of the m-functions).
We also talked about the idea of turning everything into VST or DirectX plugins as a final step, but we might lack the time for that.
TODO: Our promotor M. Moonen would like us to give an intermediate presentation. There we might explain some theoretical issues about the effects and extraction algorithms we already tried, and also our project status and the next steps towards our goal.

13 February 2007

warming up session 2007

Exams finished, vacation finished. As for me, I'm almost healthy again.
Today Tony and I met to set things straight and restart our discovery.
Our plan for the near future:
This week we will finish the basic literature study (as in: we will read the DAFX book and surf the internet a lot).
So next week we can start trying out some feature extraction methods (as we already did with effects).

On Monday 19 Feb we hope to meet with our assistants.