Microphone input, sound detection.

I want to make a program that can input sound from a microphone, and determine if the sound is one of three pitches.

How can I input the sound?
I know that sound is recorded by taking a whole bunch of samples per second. How can I manipulate these samples to allow me to average/blur a collection of samples?

I'm using Windows.

Thanks.
Uhh... you're going to have to find out if there's a specification for microphone input; otherwise, you'll have to interface with the device driver. Do you know how to process the samples to detect the pitch already?
I was thinking I would take a sample of what it needs to detect, and then just go with whatever that gives me.

Something like this.
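#include <iostream>

// Pseudocode: the sound type and getsound() are placeholders for whatever
// the audio library ends up providing; they do not exist yet.
int main()
{
    while (true)
    {
        sound sample = getsound();
        std::cout << "Pitch: " << sample.getpitch()
                  << " Volume: " << sample.getvolume() << "\n";
    }
}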


Then once I know the volume and pitch, I could set up some loops to wait for the sound and take action accordingly.
You seem to have a very idealistic view of how sound on a computer works. You need one of:
a) knowledge of how applications access the sound driver
b) something like SDL (Simple DirectMedia Layer)
c) enough experience to write a kernel driver.

I'd go with b. Look up SDL on the internet.
I'm having trouble finding out how to get microphone input with SDL.
Well, a microphone is literally a reverse speaker. If you can find out how to use one, you can find out how to use the other. By the way, a microphone would have its own driver, with which you will need to communicate. Otherwise, you're not going to get anywhere.
I know nothing about using drivers. Where could I learn?
I know that sound is recorded by taking a whole bunch of samples per second. How can I manipulate these samples to allow me to average/blur a collection of samples?


This is a far more complicated question than you realize. Go look up digital signal processing on Wikipedia or something.

Basically, the way audio works is you have a series of samples taken every so often. If you make a graph of these samples, you'd use time as the X axis, and the sample as the Y axis. You then "connect the dots" to create the sound wave.
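To make the graph idea concrete, here is a minimal sketch that builds the samples of a pure tone. The function name makeSine and its parameters are made up for illustration; they are not from any audio library. Sample number i sits at time i / sampleRate on the X axis, and its value is the Y axis.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds `count` samples of a sine wave at `freqHz`,
// taken `sampleRate` times per second. Values range from -1.0 to 1.0.
std::vector<double> makeSine(double freqHz, double sampleRate, std::size_t count)
{
    const double twoPi = 6.283185307179586;
    std::vector<double> samples(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        double t = static_cast<double>(i) / sampleRate; // X axis: time in seconds
        samples[i] = std::sin(twoPi * freqHz * t);      // Y axis: amplitude
    }
    return samples;
}
```

For example, makeSine(440.0, 44100.0, 44100) would give one second of a 440 Hz tone at a 44100 Hz sample rate. A real recording looks like the same kind of array, just with a much messier shape.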

Taking a handful of samples and trying to figure out what kind of sound it makes is very very very very complicated.
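As for the averaging/blurring part of the question, the simplest version is probably a moving average: each output value is the mean of the last few input samples. The sketch below assumes the samples are already in a std::vector<double>; movingAverage is a name I made up. Note that this only smooths the wave (it acts as a crude low-pass filter). It does not tell you the pitch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: averages every run of `window` consecutive samples.
// Returns in.size() - window + 1 values, or an empty vector if the input
// is shorter than the window (or the window is 0).
std::vector<double> movingAverage(const std::vector<double>& in, std::size_t window)
{
    std::vector<double> out;
    if (window == 0 || in.size() < window)
        return out;

    out.reserve(in.size() - window + 1);
    double sum = 0.0;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        sum += in[i];                      // newest sample enters the window
        if (i >= window)
            sum -= in[i - window];         // oldest sample leaves the window
        if (i + 1 >= window)
            out.push_back(sum / static_cast<double>(window));
    }
    return out;
}
```

A bigger window blurs more, but it also wipes out higher pitches, so it may work against you if the tones you want to detect are high.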

EDIT:

I just re-read your original post. If you want to determine the pitch of a sound, you'll need to do a Fourier transform. There are libraries that do this. Google for FFT libraries. But again.... even with a lib... it's more complicated than you think.
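Since there are only three pitches to tell apart, a full FFT may not even be needed. A rough sketch of a simpler alternative: measure how strongly each candidate frequency shows up in a block of samples (this is a single bin of a discrete Fourier transform, computed directly) and pick the strongest. The names tonePower and strongestPitch are hypothetical, and this is not what an FFT library would give you out of the box.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Strength of one frequency in a block of samples: correlate the block
// with a cosine and a sine at that frequency (one DFT bin, done directly).
double tonePower(const std::vector<double>& samples, double freqHz, double sampleRate)
{
    const double twoPi = 6.283185307179586;
    double re = 0.0;
    double im = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i)
    {
        double angle = twoPi * freqHz * static_cast<double>(i) / sampleRate;
        re += samples[i] * std::cos(angle);
        im -= samples[i] * std::sin(angle);
    }
    return re * re + im * im; // squared magnitude
}

// Index of the candidate frequency with the most power,
// or -1 if the candidate list is empty.
int strongestPitch(const std::vector<double>& samples,
                   const std::vector<double>& candidatesHz,
                   double sampleRate)
{
    int best = -1;
    double bestPower = -1.0;
    for (std::size_t k = 0; k < candidatesHz.size(); ++k)
    {
        double power = tonePower(samples, candidatesHz[k], sampleRate);
        if (power > bestPower)
        {
            bestPower = power;
            best = static_cast<int>(k);
        }
    }
    return best;
}
```

This always picks one of the candidates, even for silence or noise, so a real program would also need some threshold for "none of the three". Real microphone input has harmonics and background noise too, which is part of why this gets complicated.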

As for the rest of your question:

A quick skim of SDL documentation suggests it doesn't have support for audio input (only audio output).

You'll have to use another library.

On Windows, I know you can use waveIn, but it's a complicated process.

1) Open an input device with waveInOpen
2) Prepare one or a few buffers with waveInPrepareHeader
3) Give those buffers to the input device with waveInAddBuffer
4) Start recording with waveInStart
5) Audio is recorded and fills your buffers
6) Continue to provide additional buffers as needed for as long as you want to record audio
7) Stop recording with waveInStop
8) Free buffers with waveInUnprepareHeader
9) Close input device with waveInClose
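
The steps above can be sketched roughly like this. Treat it as an untested outline: it uses a single buffer and polls for completion instead of using a callback, so step 6 (feeding more buffers) is left out. The names pcmBytes and recordMono16 are made up; the waveIn calls and structs come from mmsystem.h, and you would need to link against winmm.lib. The #ifdef is only there so the file still compiles on other systems, where the function just returns nothing.

```cpp
#include <cstddef>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <mmsystem.h> // waveIn* functions; link with winmm.lib
#endif

// Bytes needed to hold `seconds` of uncompressed PCM audio.
std::size_t pcmBytes(unsigned sampleRate, unsigned channels,
                     unsigned bitsPerSample, unsigned seconds)
{
    return static_cast<std::size_t>(sampleRate) * channels * (bitsPerSample / 8) * seconds;
}

// Hypothetical helper: records `seconds` of 16-bit mono audio from the
// default input device. Returns an empty vector on failure or off Windows.
std::vector<short> recordMono16(unsigned sampleRate, unsigned seconds)
{
    std::vector<short> samples;
#ifdef _WIN32
    WAVEFORMATEX fmt = {};
    fmt.wFormatTag = WAVE_FORMAT_PCM;
    fmt.nChannels = 1;
    fmt.nSamplesPerSec = sampleRate;
    fmt.wBitsPerSample = 16;
    fmt.nBlockAlign = static_cast<WORD>(fmt.nChannels * fmt.wBitsPerSample / 8);
    fmt.nAvgBytesPerSec = fmt.nSamplesPerSec * fmt.nBlockAlign;

    HWAVEIN dev = nullptr;
    if (waveInOpen(&dev, WAVE_MAPPER, &fmt, 0, 0, CALLBACK_NULL) != MMSYSERR_NOERROR) // step 1
        return samples;

    samples.resize(pcmBytes(sampleRate, 1, 16, seconds) / sizeof(short));
    WAVEHDR hdr = {};
    hdr.lpData = reinterpret_cast<LPSTR>(samples.data());
    hdr.dwBufferLength = static_cast<DWORD>(samples.size() * sizeof(short));

    bool ok = waveInPrepareHeader(dev, &hdr, sizeof(hdr)) == MMSYSERR_NOERROR;  // step 2
    if (ok)
    {
        ok = waveInAddBuffer(dev, &hdr, sizeof(hdr)) == MMSYSERR_NOERROR        // step 3
          && waveInStart(dev) == MMSYSERR_NOERROR;                              // step 4
        while (ok && !(hdr.dwFlags & WHDR_DONE))                                // step 5
            Sleep(10); // wait for the driver to fill the buffer
        waveInStop(dev);                                                        // step 7
        waveInReset(dev); // hands back the buffer if it is still queued
        waveInUnprepareHeader(dev, &hdr, sizeof(hdr));                          // step 8
    }
    waveInClose(dev);                                                           // step 9

    if (ok)
        samples.resize(hdr.dwBytesRecorded / sizeof(short));
    else
        samples.clear();
#else
    (void)sampleRate;
    (void)seconds;
#endif
    return samples;
}
```

Something like recordMono16(44100, 1) would then block for about a second and hand back the samples. For continuous listening you would want at least two buffers so one can fill while you process the other, which is where step 6 and the realtime headaches come in.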


None of the steps are trivial. If you're really interested in learning you can look it up on MSDN here:

http://msdn.microsoft.com/en-us/library/aa908147.aspx <-- link to waveIn documentation.


Personally, I would start with waveOut and audio output streaming since it's more or less the same process -- but easier to figure out that you're doing it right/wrong and therefore easier to diagnose. I say try streaming a .wav file with waveOut first... and once you can do that successfully, then try to record something.


I've done lots of work with waveOut in a previous life (it's been years), so I have a pretty good grasp of it if you have additional questions. I never actually used waveIn, but from the documentation it looks like it's pretty much the exact same idea.

Also I'm not on Windows so unfortunately I won't be able to test things out for you, so you'll largely be on your own (unless someone else on here can help).

So yeah....

EDIT2:

Needless to say, this probably isn't a task for a beginner (I just realized this is the beginner's forum!). Audio streaming demands realtime attention, sometimes multithreading and thread safety, and other advanced programming concepts.

There might be a lib out there that makes audio recording simpler. Try googling for audio recording libs. But note.... any lib you find will only record PCM data (ie: samples). Nothing will tell you what tones are playing -- you'll have to run the samples through an FFT and all that jazz to figure that out.
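
Volume, at least, is easy to get from raw PCM. A minimal sketch, assuming 16-bit signed samples (a common format, but check what your library actually gives you): take the root mean square of the samples, scaled so the result is between 0.0 and 1.0. The name rmsLevel is made up.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: RMS level of 16-bit PCM samples, scaled to 0.0 .. 1.0.
// Returns 0.0 for an empty block.
double rmsLevel(const std::vector<std::int16_t>& pcm)
{
    if (pcm.empty())
        return 0.0;

    double sumSquares = 0.0;
    for (std::int16_t s : pcm)
    {
        double x = s / 32768.0; // scale to roughly -1.0 .. 1.0
        sumSquares += x * x;
    }
    return std::sqrt(sumSquares / static_cast<double>(pcm.size()));
}
```

Comparing this against a small threshold is one way to decide whether there is any sound worth analyzing before bothering with the pitch detection.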
Thanks Disch.
We learned a lot about waves in physical science last year. If this is true:
Basically, the way audio works is you have a series of samples taken every so often. If you make a graph of these samples, you'd use time as the X axis, and the sample as the Y axis. You then "connect the dots" to create the sound wave.

I'm pretty sure I can do it.
rock on.

Then yeah.. check out waveIn. Lemme know if you have Qs and I'll see if I can help.