Speech Recognition
Really Vowel Recognition
- I just recently had a go at speech recognition.
- It turns out that you can take the Fourier transform of a .wav
file, and the frequency components it gives are roughly what the
hair cells in the ear respond to.
- So, it's pretty easy to get the first layer of input.
- It also turns out that a lot of animals (e.g. gerbils) respond
to vowel sounds.
- The strong peaks in the spectrum (each sits at some frequency, e.g.
that of middle C) are called formants: the lowest in frequency is the
first formant, the next the second formant, and so forth.
- If you make a first-formant by second-formant matrix (a 2-D plot of
one against the other), each vowel sound maps to an ellipse.
- It also turns out that parts of the cortex compute this kind of map,
though it may already occur earlier in the auditory pathways.
- So, I did a 5-fold test on my own voice using three vowels: a (as in
father), oo (as in loot), and ee (as in keep). In each fold I trained
on 2 recordings of each vowel and tested on the other 8.
- I normalized for volume.
- The system gets about 92% correct.
- I'm not really happy with this result, but it's a start, and
we should be able to take advantage of interleaved processing.
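
The steps in the notes above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the system that produced the 92% figure. The function names, the peak-picking rule (strongest bin, then the strongest bin at least 200 Hz away), the peak-magnitude volume normalization, the nearest-centroid classifier, and the formant values in ILLUSTRATIVE_FORMANTS are all assumptions made for the example. The demo uses synthetic two-tone "vowels" rather than real recordings, so its score says nothing about real speech.

```python
import wave
import numpy as np

# Illustrative (first formant, second formant) pairs in Hz. These are assumed
# round numbers for the demo, not measurements from the notes above.
ILLUSTRATIVE_FORMANTS = {"a": (730.0, 1090.0), "oo": (300.0, 870.0), "ee": (270.0, 2290.0)}

def read_wav_mono(path):
    """Load a 16-bit PCM .wav file; returns (samples, sample_rate), first channel only."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wf.getframerate()
        channels = wf.getnchannels()
        raw = wf.readframes(wf.getnframes())
    data = np.frombuffer(raw, dtype="<i2").astype(float)
    return data.reshape(-1, channels)[:, 0], rate

def magnitude_spectrum(samples, sample_rate):
    """Fourier transform of a mono signal; returns (freqs_hz, magnitudes)."""
    samples = np.asarray(samples, dtype=float)
    mags = np.abs(np.fft.rfft(samples * np.hanning(len(samples))))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    peak = mags.max()
    if peak > 0:
        mags = mags / peak  # assumed volume normalization: strongest bin becomes 1.0
    return freqs, mags

def two_strongest_peaks(freqs, mags, min_sep_hz=200.0):
    """Crude formant stand-in: the strongest bin plus the strongest bin at
    least min_sep_hz away from it, returned in ascending frequency order."""
    first = int(np.argmax(mags))
    far_enough = np.abs(freqs - freqs[first]) >= min_sep_hz
    second = int(np.argmax(np.where(far_enough, mags, -1.0)))
    return tuple(sorted((float(freqs[first]), float(freqs[second]))))

def train_centroids(examples):
    """examples maps label -> list of (f1, f2); returns label -> mean point."""
    return {label: np.mean(np.asarray(pts, dtype=float), axis=0)
            for label, pts in examples.items()}

def classify(point, centroids):
    """Return the label whose centroid is nearest in the (f1, f2) plane."""
    point = np.asarray(point, dtype=float)
    return min(centroids, key=lambda label: np.linalg.norm(point - centroids[label]))

def synth_vowel(f1, f2, sample_rate=16000, seconds=0.25, rng=None):
    """Synthetic stand-in for a recording: two sine tones plus optional noise."""
    t = np.arange(int(sample_rate * seconds)) / sample_rate
    signal = np.sin(2 * np.pi * f1 * t) + 0.7 * np.sin(2 * np.pi * f2 * t)
    if rng is not None:
        signal = signal + 0.05 * rng.standard_normal(len(t))
    return signal

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rate = 16000
    features = {}
    for label, (f1, f2) in ILLUSTRATIVE_FORMANTS.items():
        points = []
        for _ in range(10):  # 10 tokens per vowel, as implied by "2 train, 8 test"
            j1, j2 = rng.normal(0.0, 20.0, size=2)  # jitter so tokens differ
            sig = synth_vowel(f1 + j1, f2 + j2, rate, rng=rng)
            points.append(two_strongest_peaks(*magnitude_spectrum(sig, rate)))
        features[label] = points
    # One fold only: train on the first 2 tokens of each vowel, test on the other 8.
    centroids = train_centroids({k: v[:2] for k, v in features.items()})
    held_out = [(k, p) for k, v in features.items() for p in v[2:]]
    correct = sum(classify(p, centroids) == k for k, p in held_out)
    print(f"{correct}/{len(held_out)} held-out tokens classified correctly")
```

To try it on a real recording, something like `samples, rate = read_wav_mono("my_vowel.wav")` followed by `two_strongest_peaks(*magnitude_spectrum(samples, rate))` should give one point in the formant plane (the file name is hypothetical). Real speech has harmonics and broad resonances, so a proper formant tracker would likely do better than picking raw FFT peaks. A classifier that fits an ellipse per vowel would also be closer to the picture described above than plain nearest-centroid.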