Standard deviation function refuses to work :(

I have already spent about an hour on this piece of code. There must be some sneaky bugs lurking somewhere. I implore you, o Wise One, please enlighten this ignorant newbie...


A bunch of integers are given as a vector, and their mean is known. I want to calculate their standard deviation.



double sd(const std::vector<int> &results, double mean)
{
    int n = results.size();   // getting the sample size


//  summing over  (deviation)^2

    double temp = 0;

    for(int i = 0; i < n; i++)
    {
        temp += ((double) results[i] - mean)*( (double)results[i] - mean);
    };


//  this is the variance

    double temp2 = temp / (double)(n-1);


//  take sqrt for standard deviation

    return std::sqrt(temp2);
};




This code gave rubbish when I plugged it into my main program. I then tested it using random integers in [0, 10], and it always returns 1 no matter what...

(The mean has been fluctuating around 5, so I know the random number is working.)
Can you give an example? What output do you get and what output did you expect?
Looks fine to me. Perhaps you're not calling the function correctly, or storing the output in an int, or some other such error.

#include <vector>
#include <cmath>
#include <iostream>

using namespace std;


double sd(const std::vector<int> &results, double mean)
{
    int n = results.size();   // getting the sample size


//  summing over  (deviation)^2

    double temp = 0;

    for(int i = 0; i < n; i++)
    {
        temp += ((double) results[i] - mean)*( (double)results[i] - mean);
    };


//  this is the variance

    double temp2 = temp / (double)(n-1);


//  take sqrt for standard deviation

    return std::sqrt(temp2);
};

int main()
{
  vector<int> input(8);
  input[0] = 0; input[1] = 1; input[2] = 2; input[3] = 3; input[4] = 4; input[5] = 5; input[6] = 6; input[7] = 7;
  cout << sd(input, 3.5);
}
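
For reference, the value this test should print can be worked out by hand. The arithmetic below is only a check of the formula used in sd(), with the inputs 0 through 7 and the mean 3.5 taken from the test program above:

```latex
\begin{aligned}
\sum_{i=0}^{7} (x_i - 3.5)^2 &= 2\,(0.5^2 + 1.5^2 + 2.5^2 + 3.5^2) = 2 \times 21 = 42,\\
s^2 &= \frac{42}{8 - 1} = 6,\\
s &= \sqrt{6} \approx 2.449.
\end{aligned}
```

A result near 2.449 from this test would mean the function itself is fine and the problem lies in how it is called or printed.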



I do not see any problem with the code, although it could be written more simply. For example:

double sd( const std::vector<int> &results, double mean )
{
	const std::vector<int>::size_type n = results.size();

	double temp = 0;

	for( std::vector<int>::size_type i = 0; i < n; i++ )
	{
		temp += ( results[i] - mean ) * ( results[i] - mean );
	}


//  take sqrt for standard deviation

	return std::sqrt( temp / ( n - 1 ) );
}


You should 1) look at the values in the vector; 2) look at the return value

For example
1)
for ( auto x : results ) std::cout << x << ' ';
std::cout << std::endl;

2)
std::cout << std::sqrt( temp / ( n - 1 ) ) << std::endl;

I think that the problem is somewhere else in your program.
1)
for ( auto x : results ) std::cout << x << ' ';
std::cout << std::endl;


That's if his compiler supports range-based for loops.
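
If range-based for is not available, an index loop does the same job on older compilers. A minimal sketch, where join_values is a hypothetical helper name and the output is collected in a string rather than written to std::cout:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats every element followed by a single space,
// using only pre-C++11 language features.
std::string join_values(const std::vector<int>& values)
{
    std::ostringstream out;
    for (std::vector<int>::size_type i = 0; i < values.size(); ++i)
    {
        out << values[i] << ' ';
    }
    return out.str();
}
```

Calling std::cout << join_values(results) << std::endl; inside sd() would then show what the function actually received.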
I generate random integers in [0, max - 1] using:


    vector<int> samples;

    for(int i = 0; i < sample_size; i++)
    {
        samples.push_back( ( rand()%max ) );
    };



The mean is calculated with:


double mean(const std::vector<int> &samples)
{
    // get sample size
    int n = samples.size();


    // add everything
    int temp = 0;

    for(int i = 0; i < n; i++)
    {
        temp += results[i];
    };


    // taking average
    return ((double) temp / n);
};



Regardless of what I use for max and sample_size, I always get 1 for standard deviation.


My expectation? Isn't it common knowledge that:

standard deviation ~ max / sqrt(sample_size) ?
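
A note on that expectation: for integers drawn uniformly from {0, ..., max - 1}, the sample standard deviation should settle near sqrt((max^2 - 1) / 12) and does not shrink as the sample grows. A max / sqrt(sample_size) scaling looks more like the standard error of the mean. The sketch below illustrates this without random numbers. The names sample_sd, uniform_sd and balanced_sample are hypothetical, and the balanced sample (every value appearing equally often) is an assumption standing in for rand() output:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sample standard deviation with the n - 1 denominator, as in sd() above.
double sample_sd(const std::vector<int>& values)
{
    const std::size_t n = values.size();
    if (n < 2) return 0.0;  // assumption: report zero spread for tiny samples

    double sum = 0.0;
    for (int v : values) sum += v;
    const double mean = sum / n;

    double squares = 0.0;
    for (int v : values) squares += (v - mean) * (v - mean);
    return std::sqrt(squares / (n - 1));
}

// Standard deviation of a discrete uniform distribution on {0, ..., m - 1}.
double uniform_sd(int m)
{
    return std::sqrt((static_cast<double>(m) * m - 1.0) / 12.0);
}

// Hypothetical stand-in for a random sample: each value 0..m-1 appears
// exactly `repeats` times.
std::vector<int> balanced_sample(int m, int repeats)
{
    std::vector<int> values;
    for (int r = 0; r < repeats; ++r)
        for (int v = 0; v < m; ++v)
            values.push_back(v);
    return values;
}
```

For the range [0, 10] this gives uniform_sd(11) = sqrt(10), roughly 3.16, whether the sample has 110 or 11000 elements, so a correct run should report something close to that rather than 1.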
Can you show how you call the sd function?
--edit--

Code is fixed.

The renaming of variables was to clean up some less-than-decent language included in the code out of my frustration. And well that inevitably introduced typos...

--edit--


Here's my main():



int main()
{
    int max;

    cout << "Random integer variable between 0 and: " << endl;
    cin >> max;

    max++;  // i want the range to be [0, max]



    srand( time(0) );



    int sample_size;

    cout << "sample size?" << endl;
    cin >> sample_size;



// generating the sample

    vector<int> samples;

    for(int i = 0; i < sample_size; i++)
    {
        samples.push_back( ( rand()%max ) );
    };


//  calling the functions

    double avg = mean(samples);
    double error = sd(samples, avg);



// output to console

    cout << "Mean is " << avg << endl;
    cout << "Uncertaity is " << sd << endl;

    return 0;

};
Please post real code. Yours gives compiler errors. You use results instead of samples in mean, and in main you do i < samples.
@wohtp


Did you read what I wrote?! Please do not post again until you have checked the values in the vector and the return value of your function.

I suspect that the values in the vector are not what you are assuming. Maybe you should replace std::vector<int> with std::vector<double>.
cout << "Uncertaity is " << sd << endl;

sd is a function. I think you meant to print the error variable.


Also, you don't need a semicolon after function definitions.
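
The constant 1 fits the diagnosis of printing sd itself. A function name in a stream expression decays to a function pointer, the only operator<< overload that accepts it is the one taking bool, and a non-null pointer therefore prints as 1. A minimal sketch of the effect, where half and the two helper functions are hypothetical names used only for illustration (compilers may warn about the first stream expression):

```cpp
#include <sstream>
#include <string>

// Hypothetical function standing in for sd().
double half(double x)
{
    return x / 2.0;
}

// Streams the function name itself: the function pointer converts to
// bool (true), so the text written is "1".
std::string print_function_name()
{
    std::ostringstream out;
    out << half;
    return out.str();
}

// Streams the result of calling the function: the text written is "2.5".
std::string print_call_result()
{
    std::ostringstream out;
    out << half(5.0);
    return out.str();
}
```

In the posted main(), streaming the error variable instead of sd would send the computed value to the console.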