Am I understanding this code correctly?

This approach may be unconventional. In the code submitted I commented what I understood and any questions or areas I was confused about. Thanks to anyone who is patient enough to read through it and provide some guidance.

#include <windows.h> 
#include <iostream> 
#pragma comment(lib, "winmm.lib")
using namespace std;

int main(){

// I understand what's going on here. Open up the file.
// I was told to use CreateFile() instead, oops. I will fix later.
HMMIO handle = mmioOpen(L"C:\\WINDOWS\\MEDIA\\TADA.WAV", 0, MMIO_READ);

// Create ChunkInfo structure to store the RIFF chunk info 
MMCKINFO ChunkInfo;

// My guess is this is to allocate memory for the ChunkInfo structure.
// Why is this only done here, and only for ChunkInfo, or am I 
// really confused on what's going on here??
memset(&ChunkInfo,0, sizeof(MMCKINFO));

// Descend into the RIFF chunk and store the info in the 
// ChunkInfo structure
mmioDescend(handle, &ChunkInfo, 0, MMIO_FINDRIFF);

// I wanted to see what these were.
cout<<"ChunkInfo.ckid: " <<ChunkInfo.ckid<< endl;
cout<<"ChunkInfo.cksize: " <<ChunkInfo.cksize<< endl;
cout<<"ChunkInfo.dwDataOffset: " <<ChunkInfo.dwDataOffset<< endl;
cout<<"ChunkInfo.dwFlags: " <<ChunkInfo.dwFlags<< endl;
cout<<"ChunkInfo.fccType: " <<ChunkInfo.fccType<< endl;

// Create another structure to store the fmt chunk info
MMCKINFO FormatChunkInfo;

// Assign the ckid of FormatChunkInfo to the result of mmioStringToFOURCC()
FormatChunkInfo.ckid = mmioStringToFOURCC(L"fmt", 0);

// Descend into the fmt chunk, whose parent chunk is RIFF.
// The flag MMIO_FINDCHUNK says to find the chunk with the ckid "fmt".
mmioDescend(handle, &FormatChunkInfo, &ChunkInfo, MMIO_FINDCHUNK);

// Create another structure waveFmt to store the wav format info.
WAVEFORMATEX waveFmt;

// Not sure how this is playing out.
// handle is the handle of the file to be read.
// (char*)&waveFmt is the pointer to a buffer to contain the data read from the file.
// FormatChunkInfo.cksize is being used as the number of bytes to read from the file.
// Pointers throw me off sometimes. How is this a pointer (char*)&waveFmt?
// Is it casting the address of waveFmt as a char* (pointer to a char)??
mmioRead(handle, (char*)&waveFmt, FormatChunkInfo.cksize);

// Same as before creating a structure to store the DataChunkInfo
MMCKINFO DataChunkInfo;

// Because we descended deep into the RIFF chunk, 
// we need to ascend out of it before we descend to a new chunk.
mmioAscend(handle, &FormatChunkInfo, 0);

// Assign the ckid of DataChunkInfo to the result of mmioStringToFOURCC()
DataChunkInfo.ckid = mmioStringToFOURCC(L"data", 0);

// Descend into the data chunk, whose parent chunk is RIFF.
// The flag MMIO_FINDCHUNK says to find the chunk with the ckid "data".
mmioDescend(handle, &DataChunkInfo, &ChunkInfo, MMIO_FINDCHUNK);

// Assigning the DataChunkInfo.cksize to a new int size.
unsigned int size = DataChunkInfo.cksize;

// Using "size" to allocate memory for the data to be read into.
char* data1 = new char[size];

// Read from the file, from the chunk we've descended to into data1
mmioRead(handle, data1, size);

// Close the file
mmioClose(handle, 0);

// I tried to use cout to read the data stored in data1
// but got a bunch a big mess of crazy symbols.

system("pause");
return 0;
}


Thanks!
Chris
// My guess is this is to allocate memory for the ChunkInfo structure.
// Why is this only done here, and only for ChunkInfo, or am I 
// really confused on what's going on here??
memset(&ChunkInfo,0, sizeof(MMCKINFO));
memset(), as its name suggests, sets memory. In this case, it zeroes the memory of the structure.
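
To make the difference between allocating and zeroing concrete, here is a minimal portable sketch. ChunkInfoLike is a hypothetical stand-in for MMCKINFO (not the real Windows type), and the helper names are made up for illustration. The storage exists as soon as the variable is declared; memset() only overwrites its bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for MMCKINFO: a plain struct of integers.
struct ChunkInfoLike {
    std::uint32_t ckid;
    std::uint32_t cksize;
    std::uint32_t fccType;
    std::uint32_t dwDataOffset;
    std::uint32_t dwFlags;
};

// Returns true when every byte of the struct is zero.
bool allBytesZero(const ChunkInfoLike& info) {
    unsigned char raw[sizeof(ChunkInfoLike)];
    std::memcpy(raw, &info, sizeof raw);
    for (unsigned char b : raw) {
        if (b != 0) return false;
    }
    return true;
}

// Declaring the variable already reserves its storage;
// memset only fills that storage with zero bytes.
ChunkInfoLike zeroedWithMemset() {
    ChunkInfoLike info;
    std::memset(&info, 0, sizeof(info));
    return info;
}

// Value-initialization with empty braces gives the same result in C++.
ChunkInfoLike zeroedWithBraces() {
    ChunkInfoLike info = {};
    return info;
}
```

Zeroing the other structures in the same way before setting their ckid would be harmless, and it avoids passing uninitialized fields around.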

// Not sure how this is playing out.
// handle is the handle of the file to be read.
// (char*)&waveFmt is the pointer to a buffer to contain the data read from the file.
// FormatChunkInfo.cksize is being used as the number of bytes to read from the file.
// Pointers throw me off sometimes. How is this a pointer (char*)&waveFmt?
// Is it casting the address of waveFmt as a char* (pointer to a char)??
mmioRead(handle, (char*)&waveFmt, FormatChunkInfo.cksize);
Treat waveFmt as an array of bytes (a buffer) and read from the file some number of bytes into said buffer. This is the traditional method to serialize data in C.
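
The cast can be tried without any Windows headers. In the sketch below, FormatHeader is a hypothetical struct whose layout mirrors the leading fields of WAVEFORMATEX, and readBytes is a made-up stand-in for mmioRead that copies from a memory buffer. It assumes a little-endian host, which matches how WAV stores its fields. It also caps the byte count at the size of the struct, since a chunk can be larger than the struct it is read into.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical header; field order mirrors the start of WAVEFORMATEX.
struct FormatHeader {
    std::uint16_t formatTag;
    std::uint16_t channels;
    std::uint32_t samplesPerSec;
    std::uint32_t avgBytesPerSec;
    std::uint16_t blockAlign;
    std::uint16_t bitsPerSample;
};

// Hypothetical stand-in for mmioRead: copies up to `count` bytes
// from `source` into `dest` and returns how many were copied.
std::size_t readBytes(const unsigned char* source, std::size_t available,
                      char* dest, std::size_t count) {
    std::size_t n = count < available ? count : available;
    std::memcpy(dest, source, n);
    return n;
}

// Fills a FormatHeader from raw chunk bytes without ever writing
// more than sizeof(FormatHeader) bytes into it.
FormatHeader readHeader(const unsigned char* chunk, std::size_t chunkSize) {
    FormatHeader header = {};
    std::size_t wanted = chunkSize < sizeof(header) ? chunkSize : sizeof(header);
    // &header is the address of the struct; the cast views that same
    // address as a pointer to its first byte, which is what a byte-wise
    // read function expects.
    readBytes(chunk, chunkSize, reinterpret_cast<char*>(&header), wanted);
    return header;
}
```

So yes: (char*)&waveFmt takes the address of waveFmt and tells the compiler to treat it as a pointer to bytes.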

I tried to use cout to read the data stored in data1
but got a bunch a big mess of crazy symbols.
Unsurprisingly. It's a sound file, so it contains binary data.


@helios:
Thanks for the quick response.

@everyone else:
How do I get it to a data type that I can read? For instance if I wanted to use the data as an int or a float?

Not sure if I'm getting this... let's see. I added this bit to read through the buffer "data1". Last time I tried this I got a bunch of crazy symbols. Helios informed me that it was because it was binary data. So I decided to cast this data as an int. It seemed to work. I got a series of signed ints. I'm not sure if this is representative of the actual wav.


for (int i=size, x = 0; i > 0; i--, x++){

cout<< (int)data1[x]<<endl;
}
For instance if I wanted to use the data as an int or a float?


You mean, for instance, if you have a long integer that is equal to 10, you want to read the "float version" of it as 10.000000? Or do you want to interpret the memory differently? For the first, you're correctly using casting. As for the second, I've been told to use what are called unions.
Probably not. WAV can store samples in a wide variety of formats, most of which are at least 16-bit wide.

This subject is much too broad to cover here. The gist of it is that the number, say, 48879 can be stored as EF BE (little endian) or BE EF (big endian). 48879 is an unsigned number, but EF BE can also be interpreted as a signed number: -16657 if treated as little endian, or -4162 if treated as big endian. And so on for bigger integers.
The WAV should say in its header what representation it's using.
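
The EF BE example can be checked with a short sketch. The function names are made up for illustration; the code assembles the value from individual bytes, so it does not depend on the byte order of the machine running it.

```cpp
#include <cassert>
#include <cstdint>

// First byte is the least significant one.
std::uint16_t readU16LittleEndian(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// First byte is the most significant one.
std::uint16_t readU16BigEndian(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Reinterprets the same 16 bits as a two's complement signed number.
std::int16_t asSigned(std::uint16_t v) {
    return v >= 0x8000
        ? static_cast<std::int16_t>(static_cast<int>(v) - 0x10000)
        : static_cast<std::int16_t>(v);
}
```

With the bytes EF BE, readU16LittleEndian gives 48879, and asSigned turns the little-endian and big-endian readings into -16657 and -4162 respectively.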
The WAV should say in its header what representation it's using.


What part of the header?
How should I know? Somewhere.
It's most likely PCM data, meaning each sample is represented with a 16 bit number (a short). But in your example, you're reading into char* data1 = new char[size];.

From the example I posted.

    
typedef struct
{
    SHORT LEFT;
    SHORT RIGHT;} SAMPLE;

...

LONG getnextsample(WAVEIN wav, SAMPLE* pSample)
    {
        LONG ret = mmioRead( wav -> WAVEHANDLE , (HPSTR) &pSample->LEFT, wav ->WAVEFORMAT.wBitsPerSample / 8);

        ret = mmioRead(wav-> WAVEHANDLE , (HPSTR) &pSample->RIGHT, wav ->WAVEFORMAT.wBitsPerSample / 8);
        return ret;
    }

// Could fill a vector with all the samples like this.
SAMPLE sample;
vector <short> SL;   // left channel
vector <short> SR;   // right channel
while (getnextsample(wav, &sample) > 0){
    SL.push_back(sample.LEFT);
    SR.push_back(sample.RIGHT);
}


I don't have much of a clue about endianness. You should have sample values between -32768 and 32767. Note again that I think you need the short data type, not char, to hold the samples.

Also, I didn't write this portion of the code, and I don't know what HPSTR is, but you can see the author has this in the function,

ret = mmioRead(wav-> WAVEHANDLE , (HPSTR) &pSample->RIGHT, wav ->WAVEFORMAT.wBitsPerSample / 8);

This example extracts a sample at a time. And note that the samples are interleaved.

The last parameter, in this case, accesses the wBitsPerSample member of the WAVEFORMATEX struct to get the number of bits used to store each sample, and divides by 8 to get the number of bytes.
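
Since the samples are interleaved (left, right, left, right, ...), an alternative to reading one sample at a time is to read the whole data chunk and split it afterwards. A minimal sketch, assuming 16-bit stereo data that is already in memory; StereoChannels and deinterleave are made-up names, not part of any Windows API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the two channels of a stereo stream.
struct StereoChannels {
    std::vector<std::int16_t> left;
    std::vector<std::int16_t> right;
};

// Splits L R L R ... into separate left and right vectors.
// A trailing unpaired sample, if any, is ignored.
StereoChannels deinterleave(const std::vector<std::int16_t>& interleaved) {
    StereoChannels out;
    std::size_t frames = interleaved.size() / 2;
    out.left.reserve(frames);
    out.right.reserve(frames);
    for (std::size_t f = 0; f < frames; ++f) {
        out.left.push_back(interleaved[2 * f]);
        out.right.push_back(interleaved[2 * f + 1]);
    }
    return out;
}
```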

So you notice you can tell how many bits per sample and get the format tag in your code like this:

cout << waveFmt.wBitsPerSample << " bit";
cout << waveFmt.wFormatTag;



See what happens when you change this part of your code to look like this.

unsigned int size = DataChunkInfo.cksize;

// Using "size" to allocate memory for the data to be read into.
short* data1 = new short[size];

// Read from the file, from the chunk we've descended to into data1
mmioRead(handle, (HPSTR) data1, size);


EDIT:
Also you should note that DataChunkInfo.cksize gets you the size in bytes of the data. There are 2 bytes in 16 bits. So you can tell that there are DataChunkInfo.cksize / 2 , samples.

So I think you actually need to only allocate
short *data1 = new short [size / 2];
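
The same arithmetic, written out as a small sketch. The function names are hypothetical; dataBytes plays the role of DataChunkInfo.cksize, and the other parameters correspond to the wBitsPerSample and nChannels fields of WAVEFORMATEX.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed to store one sample of the given bit depth.
std::uint32_t bytesPerSample(std::uint16_t bitsPerSample) {
    return (bitsPerSample + 7u) / 8u;
}

// Number of individual samples (all channels counted) in the data chunk.
std::uint32_t sampleCount(std::uint32_t dataBytes, std::uint16_t bitsPerSample) {
    std::uint32_t size = bytesPerSample(bitsPerSample);
    assert(size != 0);
    return dataBytes / size;
}

// Number of frames, where one frame holds one sample per channel.
std::uint32_t frameCount(std::uint32_t dataBytes, std::uint16_t bitsPerSample,
                         std::uint16_t channels) {
    assert(channels != 0);
    return sampleCount(dataBytes, bitsPerSample) / channels;
}
```

For 16-bit stereo, a 1000-byte data chunk holds 500 samples, or 250 left/right pairs.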
Thanks everyone.

@iseeplusplus: I made the changes you suggested. I also added a couple of changes to the code. I only added the bottom portion of the code where the changes took place. I also left comments in the code explaining what I think is going on in it.

// cksize is in bytes; each 16-bit sample takes 2 bytes,
// so "size" is the number of samples.
unsigned int size = DataChunkInfo.cksize/2;

// Using "size" to allocate memory for the data to be read into.
short* data1 = new short[size];

// Read from the file, from the chunk we've descended to into data1.
// mmioRead() counts bytes, not samples.
mmioRead(handle, (HPSTR)data1, size * sizeof(short));

// Close the file
mmioClose(handle, 0);


// Create a loop to assign the left and right channels.
// Convert the data of each channel to a float value
// between -1.0 and 1.0 (done for vst development)
// Each pass uses one left and one right sample, so step by 2.
for (unsigned int x = 0; x + 1 < size; x += 2){
short left, right;
float leftNorm, rightNorm;

// Assign the buffer to left and right channels
left = data1[x];
right = data1[x + 1];

// Convert left channel to a float value between -1.0 and 1.0
if (left >= 0)leftNorm = (float)left/32767;
else leftNorm = (float)left/32768;

// Convert right channel to a float value between -1.0 and 1.0
if (right >= 0) rightNorm = (float)right/32767;
else rightNorm = (float)right/32768;

// Print out to the screen the adjusted values
cout<< leftNorm<< ", "<< rightNorm<< endl;
}

system("pause");
return 0;
}


This appeared to work. Does this make sense to anyone else?

Once again, thanks.

Chris
// Convert left channel to a float value between -1.0 and 1.0
if (left >= 0)leftNorm = (float)left/32767;
else leftNorm = (float)left/32768;

// Convert right channel to a float value between -1.0 and 1.0
if (right >= 0) rightNorm = (float)right/32767;
else rightNorm = (float)right/32768;


Your values should be in a range of -32768 to 32767, so when you check
if left >= 0, you would miss the negative values.
// Convert left channel to a float value between -1.0 and 1.0
if (left >= 0)leftNorm = (float)left/32767;
else leftNorm = (float)left/32768;


I thought this code was roughly saying:
If the left channel's value is positive divide by 32767 and assign it to leftNorm, else left is negative divide by 32768.

If I'm incorrect, what is my code saying?
I thought this code was roughly saying:
If the left channel's value is positive divide by 32767 and assign it to leftNorm, else left is negative divide by 32768.

If I'm incorrect, what is my code saying?


You're right. Sorry about that.
No worries. I'm a total rookie, and I've already accepted that in most cases I'm incorrect. I just got lucky with this one ;)
I wrote this, but it's wrong. PCM could be in u8, 16, 24, or 32 bit.

It's most likely PCM data, meaning each sample is represented with a 16 bit number (a short). But in your example, you're reading into,
I wrote this, but it's wrong. PCM could be in u8, 16, 24, or 32 bit.


I was wondering about this. Currently I'm practicing with 16 bit PCM which makes things easier. If 16 bit is a short, 24 bit or 32 bit are what?

What I mean is 16 bit fits nicely into a short 16/8=2 and a short is 2 bytes. Awesome!

24/8=3.....What data type is 3 bytes... none that I know of. So what do we use?

32/8=4 So I'm guessing that a data type of int can be used??
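
There is no built-in 3-byte integer type, so one common approach is to read the three bytes and assemble them into a 32-bit integer by hand. The sketch below assumes little-endian, signed 24-bit PCM; readS24LittleEndian is a made-up name. For 32-bit data a 4-byte type does fit, but the format tag in the header decides whether those 4 bytes are an integer or a float.

```cpp
#include <cassert>
#include <cstdint>

// Builds a signed 24-bit sample from three little-endian bytes
// and widens it to 32 bits, preserving the sign.
std::int32_t readS24LittleEndian(const unsigned char* p) {
    std::int32_t value = p[0] | (p[1] << 8) | (p[2] << 16);
    // Bit 23 is the sign bit of a 24-bit number; if it is set,
    // subtract 2^24 to get the negative value.
    if (value & 0x800000) {
        value -= 0x1000000;
    }
    return value;
}
```

The resulting values range from -8388608 to 8388607.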
I'm used to using libsndfile which takes care of most of the details for you. In libsndfile, you can use any of the these functions to read an audio file:

sf_readf_short     // -32768 to 32767
sf_readf_int       // -2147483648 to 2147483647
sf_readf_float     // -1 to 1
sf_readf_double    // -1 to 1 


I tried reading the source code for libsndfile, but it's sort of complex and hard to understand.

The one thing I am wondering now is how you deal with the range of the variables extending further by one in the negative direction.

I don't think it would be right to divide the negative values by 32768, and the positive values by 32767. -10, and 10 should have the same amplitude, and obviously 0 is 0.

I guess I would divide each side by 32768.
I don't think it would be right to divide the negative values by 32768, and the positive values by 32767. -10, and 10 should have the same amplitude, and obviously 0 is 0.

I guess I would divide each side by 32768.


I was only guessing. I looked at it this way: the "maximum" negative value is -32768 and the "maximum" positive value is 32767.

Using this logic:
the negative -32768 / 32768 = -1
the positive 32767 / 32767 = 1
0 / 32767 = 0.

Like I said, I'm only guessing at how to convert the short into a float value between -1.0 and 1.0.

After doing a quick search, 24 bit audio seems like the odd man out. As in it's a bit tricky to work with. All very interesting, I'd like to figure this out without using libsndfile.
After doing a quick search, 24 bit audio seems like the odd man out. As in it's a bit tricky to work with.

I guess you would use an int, but it would be a waste of memory. I'm not sure what people do about that.

I suppose when you want to convert to and from 24 bit and float/double, you can multiply or divide by 2^24 / 2 (that is, 2^23 = 8388608).
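
That scaling idea generalizes to any integer bit depth: divide by 2^(bits-1). A rough sketch, with hypothetical function names, assuming signed integer samples already widened to 32 bits:

```cpp
#include <cassert>
#include <cstdint>

// Magnitude of the most negative value for a signed sample of
// the given bit depth, e.g. 8388608 for 24-bit.
double fullScale(int bits) {
    assert(bits >= 2 && bits <= 32);
    return static_cast<double>(std::int64_t{1} << (bits - 1));
}

// Maps a signed integer sample to the range [-1.0, 1.0).
double normalize(std::int32_t sample, int bits) {
    return sample / fullScale(bits);
}
```

With 24-bit data, -8388608 maps to exactly -1.0 and 8388607 maps to just under 1.0.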
I spoke with some of the VST developers over at the kvr.com community and got this as a reply to my "crazy logic":

Divide by 32768.0, not 32767.0, in both the positive and negative case.

LittleStudios, you're overthinking stuff here. The 16-bit short sample range is asymmetric, and a step of 1 least-significant bit in either the positive side or the negative side should be the same size. Dividing by 32768.0 means that 16-bit data will never go outside the -1.0 to +1.0 normalized float range. The fact that you'll never quite hit +1.0 is not any kind of problem.
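
The point about step size can be seen by putting the two conversions side by side. This is only an illustrative sketch; both function names are made up, and normalizeAsymmetric reproduces the two-divisor logic from earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>

// One divisor for both signs: every step of 1 has the same size.
float normalize16(std::int16_t sample) {
    return sample / 32768.0f;
}

// Two divisors, as in the earlier code: positive steps come out
// slightly larger than negative steps.
float normalizeAsymmetric(std::int16_t sample) {
    return sample >= 0 ? sample / 32767.0f : sample / 32768.0f;
}
```

With normalize16, 10 and -10 map to values of equal magnitude, -32768 maps to exactly -1.0, and 32767 lands just below 1.0. With normalizeAsymmetric, 1 and -1 no longer have the same magnitude.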


I guess it makes sense.