Wave (.wav) samples

Hi everyone,

I am trying to do something with sound in C++. The C++ standard doesn't include sound, so I'm using a nice Windows API (don't scream that I need to be on the Windows forum for that yet) which uses WAVE (.wav) files. I understand it, but I don't get the 'WAVE-samples' in the data sub chunk.
(see https://ccrma.stanford.edu/courses/422/projects/WaveFormat/ for more info)
I understand that the 16-bit integer samples (in the example on that page) are arranged left-right for the different channels. My question is, what do these two-byte numbers actually mean? Do they represent the frequency of a sine wave? (How do you save multiple notes then?) Are they the coordinates of a point in a sine wave diagram? (If so, where are the y and the x then?) Are these numbers there just to draw our attention away from the tiny green martians in the sound card who actually sing a song of which the title is encrypted in the header?
I DON'T GET IT!!!

Please reply,
Niels Meijer
They're plotted points of a sound wave.

Imagine a grid.

The top of the grid is 32767
The bottom of the grid is -32768
The center line of the grid is 0

The left side of the grid is 0
The right side of the grid is boundless (how far it goes depends on how long the tune plays)


The X axis is time.
The Y axis is the output at that particular point in time.

Each sample plots a point on that grid. The X coord is which sample index it is, the Y coord is the actual value of the sample.

A simple example:

if you have the samples 0, 1, 2, 2, 2, -1, -3 it might look like this:

 2:   ###
 1:  #
 0: #
-1:      #
-2:
-3:       #


Connect the dots and it forms a sound wave. This translates roughly into the physical motion that the speakers will perform in order to recreate the sound.
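
To make the grid idea concrete, here is a minimal C++ sketch that turns a list of sample values into text rows like the picture above. The function name plotSamples and the row layout are made up for this illustration; nothing here comes from the WAVE format or from any sound API.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: one text row per amplitude level, highest level first.
// Column i of a row holds '#' when samples[i] equals that row's level.
std::vector<std::string> plotSamples(const std::vector<int>& samples) {
    std::vector<std::string> rows;
    if (samples.empty()) return rows;

    const int top = *std::max_element(samples.begin(), samples.end());
    const int bottom = *std::min_element(samples.begin(), samples.end());

    for (int level = top; level >= bottom; --level) {
        std::string row(samples.size(), ' ');        // X axis: sample index
        for (std::size_t i = 0; i < samples.size(); ++i) {
            if (samples[i] == level) row[i] = '#';   // Y axis: sample value
        }
        rows.push_back(row);
    }
    return rows;
}
```

Calling plotSamples({0, 1, 2, 2, 2, -1, -3}) gives six rows, for levels 2 down to -3, with one '#' per sample. A real 16-bit file would need the values scaled down first, since nobody wants 65536 rows.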



EDIT:

As for how it can contain multiple tones, that's a whole other topic. I can get into the basics if you're really interested.
The values are just the amplitude of the signal over time. Pretty much the y coordinate. The x coordinate is time.

If you sample the audio at 44100 Hz, that means you record the amplitude (get a y coordinate) 44100 times per second. Sound waves are just compositions of sine waves of different frequencies.

Frequency is just how many times a sine wave oscillates per second.

If you want to generate a note, you just need to plot a sine wave with the right frequency. If you want two different notes, then just plot two appropriate sine waves and sum their corresponding sample values. The amplitude of the sine wave corresponds to loudness. A standard A is 440 Hz.
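
A rough sketch of the "sum two sine waves" idea, in case it helps. The name mixTones and its parameters are invented for this example, and the 0.5 amplitude per tone is just one way to keep the sum inside -1.0 to 1.0.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns `count` samples in the range -1.0 to 1.0
// containing two sine tones added together.
std::vector<double> mixTones(double hzA, double hzB, double sampleRate, std::size_t count) {
    const double pi = 3.14159265358979323846;
    const double amp = 0.5;  // each tone at half scale so the sum cannot exceed 1.0

    std::vector<double> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        const double t = static_cast<double>(i) / sampleRate;  // time of sample i in seconds
        out[i] = amp * std::sin(2.0 * pi * hzA * t)
               + amp * std::sin(2.0 * pi * hzB * t);
    }
    return out;
}
```

For example, mixTones(440.0, 660.0, 44100.0, 44100) would give one second of a standard A together with a tone a fifth above it.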

Depending on how you're storing your samples, you might use different ranges of values to represent the amplitude. For integer types, you usually use the full range of the type. For example, a 16-bit integer can store 2^16 different values. A signed 16-bit integer lets half of this range be negative, giving you a range of -32,768 through 32,767. If your signal goes beyond this range, you'll be clipping. You might also store audio samples in floating-point variables like a float or a double. In this case your range is typically -1.0 to 1.0. It's preferable to process audio samples stored in floating-point types. They then have to be converted to integers of whatever bit depth your target WAV file uses.
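
The float-to-integer step could look something like this. It is only a sketch: the name toInt16 is made up, and scaling by 32767 with hard clamping is one common choice, not the only one.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper: maps a floating-point sample in -1.0..1.0 to signed 16-bit.
// Values outside that range are clamped (hard clipping) instead of wrapping around.
std::int16_t toInt16(double sample) {
    const double clamped = std::max(-1.0, std::min(1.0, sample));
    // Scaling by 32767 keeps the result symmetric; -32768 is simply never produced.
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0));
}
```

Without the clamp, a value like 1.5 would scale to a number that does not fit in 16 bits, which is exactly the clipping problem mentioned above.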

The sound card converts the digital signal to an analog signal so it can send it to your amplifier and play it out of your speakers.

If you want to use a much better library for reading sound files, I recommend libsndfile.
y ( x ) = amplitude * sin ( 2 * π * frequency * x + phase )

That's the formula of a sinusoid. What should I take as parameters if I wanted to play iseeplusplus's standard A (440 Hz) with 8-bit samples? Does the sampling rate matter here, or does it only affect the quality of the sound?

Kind regards,
Niels Meijer
    for(int i = 0; i< bufferSize; ++i) {
        y[i] = amplitude * sin( (2.0 * PI * frequency) / sampleRate * i );
    }


Try this.
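
To spell out where that loop comes from: sample number n is taken at time x = n / sampleRate, so plugging that into the sinusoid formula (with the phase set to 0, as in the loop) gives the expression below. The symbols A, f, f_s and n are just my shorthand for amplitude, frequency, sampleRate and i.

```latex
y[n] = A \sin\!\left( 2\pi f \,\frac{n}{f_s} \right),
\qquad
\frac{f_s}{f} = \frac{44100}{440} \approx 100.2 \ \text{samples per cycle}
```

So the sampling rate does matter for more than quality: it decides how many samples make up one cycle, and the same samples played back at a different rate come out at a different pitch.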
For starters, 8-bit sucks, not only because the quality is horrid but also because it's unsigned, which makes it harder to work with.


Here's the 16-bit version (untested):
int amp = 0x2000; // higher or lower for volume; do not exceed 0x7FFF
const double samplerate = 44100;
const double tonehz = 440;
const double pi = 3.14159265358979323846;

// sampleindex / samplerate is the time of this sample in seconds
samples[ sampleindex ] = static_cast<int16_t>(
    amp * sin( sampleindex / samplerate * tonehz * 2 * pi )
         );



Here's the 8-bit version (untested):
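int amp = 0x20; // higher or lower for volume; do not exceed 0x7F
const double samplerate = 44100;
const double tonehz = 440;
const double pi = 3.14159265358979323846;

// go through int first: casting a negative double straight to uint8_t is undefined behavior
samples[ sampleindex ] = static_cast<uint8_t>(
    static_cast<int>( amp * sin( sampleindex / samplerate * tonehz * 2 * pi ) )
         ) ^ 0x80;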
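
In case the ^ 0x80 looks like magic: 8-bit WAVE samples are unsigned, so silence sits at 128 instead of 0. A small sketch of that shift, with a made-up helper name, assuming the signed value is already within -128..127:

```cpp
#include <cstdint>

// Hypothetical helper: converts a signed sample in -128..127 to the
// unsigned 0..255 form, where 128 is the silent center line.
std::uint8_t toUnsigned8(int signedSample) {
    return static_cast<std::uint8_t>(signedSample + 128);
}
```

For values that fit in 8 bits, adding 128 and flipping the top bit with ^ 0x80 give the same result, so either form works.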



EDIT: ninja'd =(