Windows Audio

I followed your advice and checked SDL for my project. The tutorials and walkthroughs were very helpful in the development of my project. But I decided to drop SDL, since it doesn't fit very well when trying to develop a Windows app. It's portable, but portability is not an issue, since my project targets the WIN32 platform. I decided to stick with GDI+.
Unfortunately, with my decision to drop SDL I lost my way of handling audio. I was wondering if any of you guys could tell me a Windows-specific way of handling audio. The GDI+ of audio, if you will.

Disch (13742)

DirectSound or waveOut

DirectSound will get you lower latency.

waveOut is easier to use (IMO -- but it depends a lot on what you want to do).

http://msdn.microsoft.com/en-us/library/aa910393.aspx

basic steps:

- waveOutOpen to open the audio device
- waveOutPrepareHeader to "register" memory buffers (in the form of WAVEHDRs) which you'll use to output audio
- waveOutWrite to output audio
- waveOutUnprepareHeader to clean up WAVEHDRs
- waveOutClose to close the audio device

LoLFactor (76)

And how do I process the actual .wav files?

Disch (13742)

http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/WAVE.html

It's not really as complicated as that page makes it look. Just check out the "PCM Data" Example.

It'll start with the RIFF header
then you find the 'fmt ' chunk which has the samplerate and stuff
then you have the 'data' chunk which has the PCM data.
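
To make that layout concrete, here is a minimal sketch of walking the chunks of a WAV file that has already been loaded into memory. It is plain C++ with no Windows calls. `ChunkRef` and `findChunk` are hypothetical names invented for this example, and it assumes a little-endian RIFF/WAVE file as described on the page linked above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Read a little-endian 32-bit value (WAV files store integers little-endian).
static std::uint32_t readLE32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper type: where a chunk's payload lives inside the buffer.
struct ChunkRef {
    std::size_t offset = 0;   // start of the payload, just past the 8-byte chunk header
    std::uint32_t size = 0;   // payload size as stored in the chunk header
    bool found = false;
};

// Walk the chunks that follow the 12-byte "RIFF<size>WAVE" header and return
// the first one whose four-character ID equals `id` (for example "fmt " or "data").
ChunkRef findChunk(const std::vector<unsigned char>& file, const char* id) {
    assert(std::strlen(id) == 4);  // chunk IDs are exactly four characters
    ChunkRef result;
    if (file.size() < 12 ||
        std::memcmp(file.data(), "RIFF", 4) != 0 ||
        std::memcmp(file.data() + 8, "WAVE", 4) != 0) {
        return result;  // not a RIFF/WAVE file
    }
    std::size_t pos = 12;
    while (pos + 8 <= file.size()) {
        const std::uint32_t size = readLE32(file.data() + pos + 4);
        // IDs are not null-terminated, so compare exactly four bytes.
        if (std::memcmp(file.data() + pos, id, 4) == 0) {
            result.offset = pos + 8;
            result.size = size;
            result.found = true;
            return result;
        }
        // The stored size excludes the 8-byte header; odd-sized chunks are
        // followed by one padding byte.
        pos += 8 + static_cast<std::size_t>(size) + (size & 1u);
    }
    return result;
}
```

Looking up "fmt " and then "data" this way avoids assuming that either chunk sits at a fixed offset. A caller should still check that offset + size fits inside the buffer before reading the payload.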

LoLFactor (76)

Ok..I managed to read the Wave file and send it to the device, but nothing happens...as in no sound. Do I have to send each block at a time or can I send the whole data at once? And are there any other details that I should know about the process?

Disch (13742)

...as in no sound.

Are any of the functions returning an error?

Can you show me some code?

Do I have to send each block at a time or can I send the whole data at once?

You can do either. Sending a block at a time (streaming) is a bit trickier but uses less memory.

And are there any other details that I should know about the process?

Well, there are always more details to know, but you can get by with this and just figure the rest out as you go.

LoLFactor (76)

The code isn't all that tidy... Sorry for that, but it was for testing purposes, to see if I got it right.


#include <Windows.h>
#include <strsafe.h>
#include <cstdio>

struct RIFFCHUNK {
	UCHAR	lpszName[4];
	DWORD	dwSize;
};

#define RECAST(x) reinterpret_cast<x>
#define STCAST(x) static_cast<x>

int main() {
	HANDLE hWaveFile;
	RIFFCHUNK rcChunk;
	WAVEFORMATEX wiInfo;
	DWORD dwBytesOut;
	LPTSTR lpszError[256];
	UCHAR* lpAudioData;
	hWaveFile = CreateFile(L"C:\\sound.wav", GENERIC_ALL, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN, NULL);
	SetFilePointer(hWaveFile, 12, 0, FILE_BEGIN);
	ReadFile(hWaveFile, RECAST(LPVOID)(&rcChunk), STCAST(DWORD)(sizeof(RIFFCHUNK)), &dwBytesOut, NULL);
	SetFilePointer(hWaveFile, wiInfo.cbSize, 0, FILE_CURRENT);
	do{
		ReadFile(hWaveFile, RECAST(LPVOID)(&rcChunk), STCAST(DWORD)(sizeof(rcChunk)), &dwBytesOut, NULL);
	}while(lstrcmpiA(RECAST(LPSTR)(rcChunk.lpszName), "data") == 0);
	lpAudioData = RECAST(UCHAR*)(HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, rcChunk.dwSize - 8));
	ReadFile(hWaveFile, lpAudioData, rcChunk.dwSize, &dwBytesOut, NULL);
	HWAVEOUT wo;
	WAVEHDR whdr;
	whdr.dwBufferLength = rcChunk.dwSize - 8;
	whdr.lpData = RECAST(LPSTR)(lpAudioData);
	whdr.dwFlags = 0;
	waveOutOpen(&wo, WAVE_MAPPER, &wiInfo, NULL, NULL, CALLBACK_NULL);
	waveOutPrepareHeader(wo, &whdr, sizeof(WAVEHDR));
	waveOutWrite(wo, &whdr, sizeof(WAVEHDR));
	waveOutUnprepareHeader(wo, &whdr, sizeof(WAVEHDR));
	waveOutClose(wo);
return 0;
}

LoLFactor (76)

So, anyone catch the problem? :(

Disch (13742)

Sorry, for some reason I totally didn't see your post until you bumped it.

Nothing stands out to me as being obviously wrong. The only thing I notice is that you're closing waveOut right away.

waveOutWrite is a nonblocking call. It doesn't wait for the entire sound to play, it just starts playing it and then your program keeps running. If you close waveOut immediately afterward (which is what it looks like here) then you're not giving the sound any time to play.
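
Since waveOutWrite returns immediately, the program has to wait roughly as long as the buffer takes to play. A rough sketch of that calculation, assuming uncompressed PCM where the byte rate is constant; `playbackMillis` is a hypothetical helper, not part of the waveOut API:

```cpp
#include <cassert>
#include <cstdint>

// Milliseconds needed to play `dataBytes` of PCM audio at `avgBytesPerSec`
// (the nAvgBytesPerSec value of the format), rounded up.
std::uint64_t playbackMillis(std::uint32_t dataBytes, std::uint32_t avgBytesPerSec) {
    assert(avgBytesPerSec != 0);  // a zero byte rate means the format was never filled in
    const std::uint64_t bytes = dataBytes;  // widen first so the multiply cannot overflow
    return (bytes * 1000u + avgBytesPerSec - 1u) / avgBytesPerSec;
}
```

The result could be passed to Sleep before cleaning up. Another option is to poll the WAVEHDR's dwFlags for WHDR_DONE, which the driver sets once it has finished with the buffer.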

LoLFactor (76)

Ok, so I put a Sleep call between these two lines

waveOutUnprepareHeader(wo, &whdr, sizeof(WAVEHDR));
waveOutClose(wo);

and it didn't work. Then I thought maybe it's something else, so I checked it out and saw that the code is failing at this line

waveOutOpen(&wo, WAVE_MAPPER, &wiInfo, NULL, NULL, CALLBACK_NULL);

, but it's not actually failing, because it doesn't even return a value, it just crashes the app. Any ideas? :-S

Disch (13742)

Well you would sleep before the waveOutUnprepareHeader line. Calling waveOutUnprepareHeader would sort of kill the audio buffer. It's called as part of shutdown.

Anyway, I don't see where you're actually specifying the audio format. I mean you have this WAVEFORMATEX wiInfo;, but I don't see where you're actually filling it / reading it from the file. Perhaps that's why waveOutOpen is crashing?

LoLFactor (76)

This is my code now. (BTW, I accidentally erased the line that read wiInfo the last time I posted code.)

#include <Windows.h>
#include <cstdio>

struct RIFFCHUNK {
	UCHAR	lpszName[4];
	DWORD	dwSize;
};

#define RECAST(x) reinterpret_cast<x>
#define STCAST(x) static_cast<x>

int main() {
	HANDLE hWaveFile;
	RIFFCHUNK rcChunk;
	WAVEFORMATEX wfInfo;
	DWORD dwBytesOut;
	LPTSTR lpszError[256];
	UCHAR* lpAudioData;
	
	hWaveFile = CreateFile(L"F:\\sound.wav", GENERIC_ALL, NULL, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN, NULL);
	SetFilePointer(hWaveFile, 12, 0, FILE_BEGIN);
	ReadFile(hWaveFile, RECAST(LPVOID)(&rcChunk), STCAST(DWORD)(sizeof(RIFFCHUNK)), &dwBytesOut, NULL);
	ReadFile(hWaveFile, RECAST(LPVOID)(&wfInfo), STCAST(DWORD)(sizeof(WAVEFORMATEX)), &dwBytesOut, NULL);
	
	SetFilePointer(hWaveFile, wfInfo.cbSize, 0, FILE_CURRENT);
	do{
		ReadFile(hWaveFile, RECAST(LPVOID)(&rcChunk), STCAST(DWORD)(sizeof(rcChunk)), &dwBytesOut, NULL);
	}while(lstrcmpiA(RECAST(LPSTR)(rcChunk.lpszName), "data") == 0);
	lpAudioData = RECAST(UCHAR*)(HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, rcChunk.dwSize - 8));
	
	ReadFile(hWaveFile, lpAudioData, rcChunk.dwSize, &dwBytesOut, NULL);

	HWAVEOUT wo;
	WAVEHDR whdr;
	MMRESULT mmr;

	whdr.dwBufferLength = rcChunk.dwSize - 8;
	whdr.lpData = RECAST(LPSTR)(lpAudioData);
	whdr.dwFlags = 0;

	mmr = waveOutOpen(&wo, WAVE_MAPPER, &wfInfo, NULL, NULL, CALLBACK_NULL);
	if(mmr != MMSYSERR_NOERROR) {
		std::printf("MMERROR: %d", mmr);
		ExitProcess(1);
	}
	mmr = waveOutPrepareHeader(wo, &whdr, sizeof(WAVEHDR));
	if(mmr != MMSYSERR_NOERROR) {
		std::printf("MMERROR: %d", mmr);
		ExitProcess(2);
	}
	mmr = waveOutWrite(wo, &whdr, sizeof(WAVEHDR));
	if(mmr != MMSYSERR_NOERROR) {
		std::printf("MMERROR: %d", mmr);
		ExitProcess(3);
	}

	Sleep(3000);
	mmr = waveOutUnprepareHeader(wo, &whdr, sizeof(WAVEHDR));
	if(mmr != MMSYSERR_NOERROR) {
		std::printf("MMERROR: %d", mmr);
		ExitProcess(4);
	}
	waveOutClose(wo);
return 0;
}

Still, waveOutOpen fails, returning WAVERR_BADFORMAT, which means it can't find a device that plays that type of audio. I've tried different files with different formats, but it won't work. :|

Disch (13742)

I'm not a fan of reading structs in full like that, as you're never 100% sure it's reading them properly. (It depends on how the compiler decides to lay out the structure in memory. Padding bytes and whatnot can rip you apart.)

Plus... sizeof(WAVEFORMATEX) might not be the size of the "info" chunk in the header, so you might be reading too much / too little data.

PLUS, the "info" chunk might not be immediately after the RIFF header like you appear to be assuming (there might be a comment chunk or something before it).

The first thing I'd do here is dump/print the contents of your wfInfo struct and make sure that you're getting the values you expect.

OR, as another test, you can just manually fill out the wfInfo struct rather than reading it from the file.

LoLFactor (76)

Well, it's actually the "fmt" chunk and IT IS after the "RIFF" chunk, which is exactly 12 bytes long. It says so in the specs you sent me (thx, btw :D). As for dumping the data, no need, I debugged the app and put a few watches on the variables. It's reading everything properly.

Disch (13742)

I'll have to take a look at this when I get off of work.

Can you post the WAVEFORMATEX data in here for my reference? One thing I can think of is that the 'format' or 'type' (or whatever that member is called) is wrong. IIRC, it must be == 1.

choisum (97)

I had a similar problem some time ago. I think waveOut (or whatever I was using) did not mix sounds properly; I had to close WinAmp for the sound to appear.

Don't know if that's the problem, just thought I'd mention it, since everything else appears to check out fine.

Disch (13742)

After checking the docs, I really have to stick with my gut that the problem is with the WAVEFORMATEX structure not being filled properly. It might be something as simple as the cbSize member being incorrect.

You'll have to post the contents of your wfInfo struct for me to be sure... but that's really the only thing that could cause a BADFORMAT error afaik.

Try not reading the WAVEFORMATEX struct from the file, and instead set its members manually. Just as a test to see if you can get audio working.

Here's how you'd do it properly:

WAVEFORMATEX wfx;

// zero the struct before using it.
//   in case there are additional members -- this way they
//   will have no effect
memset(&wfx,0,sizeof(WAVEFORMATEX));

// the following members must be set:
wfx.cbSize = sizeof(WAVEFORMATEX);    // the size of the structure
wfx.wFormatTag = WAVE_FORMAT_PCM;     // or = 1 (same thing).  Normal PCM audio
wfx.nChannels = 2;                    // 2 for stereo, or 1 for mono
wfx.nSamplesPerSec = 44100;           // or whatever the samplerate is
wfx.wBitsPerSample = 16;              // or maybe 8 if you have an 8-bit audio file.

// the rest should be calculated from the above:
wfx.nBlockAlign = wfx.nChannels * wfx.wBitsPerSample / 8;
wfx.nAvgBytesPerSec = wfx.nBlockAlign * wfx.nSamplesPerSec;

// 'wfx' is now good to use

LoLFactor (76)

In WAVEFORMATEX cbSize is not the size of the structure, but the size of extra info in the file's "fmt" chunk. That's why I have this line of code

SetFilePointer(hWaveFile, wfInfo.cbSize, 0, FILE_CURRENT);

to skip that extra info, which is unnecessary to me.

But you were right, I tried it your way and it worked. One problem though: I can't actually do this in my project, because I still have to get to the data chunk, which requires passing through all the others. Even if I convert all the audio to be of the exact same format, I still have to parse the files. Any idea how to fix that?

Disch (13742)

In WAVEFORMATEX cbSize is not the size of the structure, but the size of extra info in the file's "fmt" chunk.

I just double checked, and you are correct. That should probably be zero then.

But you were right, I tried it your way and it worked. One problem though: I can't actually do this in my project, because I still have to get to the data chunk, which requires passing through all the others. Even if I convert all the audio to be of the exact same format, I still have to parse the files. Any idea how to fix that?

Well again, it would really be easier to tell exactly what was going wrong if you posted what you were getting for wfInfo when you were reading it from the file =P. Since I'm working half blind here I kind of just have to guess.

But my guess is either:

1) The data was being read from the file incorrectly (reading to a struct directly always makes this a possibility -- hence why I don't encourage it)

and/or

2) The data as stored in the file doesn't have quite the same meaning as the data stored in the struct you need to pass to waveOutOpen. Possibly the nBlockAlign or nAvgBytesPerSec fields are wrong in the file or something dumb like that.

I would recommend you just take the necessary data from the "fmt " chunk and discard the rest. Something like:

// assume you are at the start of the "fmt " chunk here
//  assume wfInfo is zeroed
wfInfo.wFormatTag = file.Read2Bytes();  // make sure this value is == WAVE_FORMAT_PCM.
                          // I don't know if waveOut can handle anything else
wfInfo.nChannels = file.Read2Bytes();
wfInfo.nSamplesPerSec = file.Read4Bytes();
file.Skip6Bytes();  // skip over avg bytes per sec and block align.  Should be calculated
wfInfo.wBitsPerSample = file.Read2Bytes();
file.SkipRestOfChunk();

// calculate nBlockAlign and nAvgBytesPerSec as per above
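
The pseudocode above could be fleshed out roughly as follows. This is only a sketch under stated assumptions: it works on a "fmt " payload already held in memory, `PcmFormat` is a hypothetical portable stand-in for the WAVEFORMATEX members discussed in this thread, and `parsePcmFmt` is an invented name. It reads each field individually as little-endian, rejects anything that is not plain PCM, and recomputes the two derived fields instead of trusting the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the WAVEFORMATEX members used in this thread.
struct PcmFormat {
    std::uint16_t wFormatTag = 0;
    std::uint16_t nChannels = 0;
    std::uint32_t nSamplesPerSec = 0;
    std::uint32_t nAvgBytesPerSec = 0;
    std::uint16_t nBlockAlign = 0;
    std::uint16_t wBitsPerSample = 0;
};

static std::uint16_t readLE16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t readLE32(const unsigned char* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Parse the payload of a "fmt " chunk (the bytes after its 8-byte header).
// Returns false if the payload is too short or does not describe plain PCM.
bool parsePcmFmt(const unsigned char* payload, std::size_t size, PcmFormat& out) {
    assert(payload != nullptr);
    if (size < 16) return false;              // the basic PCM fields occupy 16 bytes
    out.wFormatTag     = readLE16(payload + 0);
    out.nChannels      = readLE16(payload + 2);
    out.nSamplesPerSec = readLE32(payload + 4);
    // Bytes 8..13 hold avg bytes/sec and block align; both are recomputed below.
    out.wBitsPerSample = readLE16(payload + 14);

    if (out.wFormatTag != 1) return false;    // 1 == WAVE_FORMAT_PCM
    if (out.nChannels == 0 || out.wBitsPerSample == 0 ||
        out.wBitsPerSample % 8 != 0) {
        return false;                         // this sketch handles whole-byte samples only
    }
    out.nBlockAlign = static_cast<std::uint16_t>(out.nChannels * out.wBitsPerSample / 8);
    out.nAvgBytesPerSec = out.nSamplesPerSec * out.nBlockAlign;
    return true;
}
```

On Windows the parsed values would then be copied into a zeroed WAVEFORMATEX before calling waveOutOpen. Checking the format tag first gives a clear failure point for files that are not PCM.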

LoLFactor (76)

Ok, so I tried reading them one by one. Using this sample wave file, http://simplythebest.net/sounds/WAV/sound_effects_WAV/sound_effect_WAV_files/alarm_clock.zip, I skip the first 12 bytes and read the "fmt " riff header, which reports a chunk size of 30 (odd, I know, but that's just how it is). I then read the attributes, which are, in order:
wFormatTag: 85
nChannels: 1
nSamplesPerSec: 11025
skip 6 bytes
wBitsPerSample: 0
I then skip 6 more bytes (fmt size, which is 30 - 8 (the riff header) - 16 (the data)) to where the "data" or "fact" chunk should be. But when I try to read the next riff header, it reads some random stuff, which leads me to believe that the cursor is out of place (actually, I'm quite sure).
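
One possible reading of these numbers, offered as an assumption rather than a confirmed diagnosis: the size stored in a RIFF chunk header counts only the payload, not the 8-byte header itself. With a reported size of 30 and 16 bytes already read, that would leave 14 bytes to skip rather than 6. Also, a format tag of 85 is 0x55, which the Windows headers define as WAVE_FORMAT_MPEGLAYER3, so this particular file may not contain plain PCM at all. The skip arithmetic can be sketched like this (`bytesToNextChunk` is a hypothetical helper):

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes to skip to reach the next chunk header, given the payload
// size from the chunk header and how many payload bytes were already read.
// The stored size excludes the 8-byte header; odd sizes carry one pad byte.
std::uint32_t bytesToNextChunk(std::uint32_t chunkSize, std::uint32_t consumed) {
    assert(consumed <= chunkSize);  // cannot have read past the end of the payload
    return (chunkSize - consumed) + (chunkSize & 1u);
}
```

For the file described above, bytesToNextChunk(30, 16) gives 14.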
