Least squares fit code for quadratic polynomial

Hello,
I am looking for a fast C++ code (or c code will do) that does least squares fit using a quadratic polynomial in 1 variable. I would prefer a self contained code, so that I don't need to link any additional library beside standard libraries.
Thanks!
Last edited on
Uhm, I think you should make one. If I were you, I'd look into synthetic division.
Is the fit linear or non-linear?

Do you need to write the program, or do you just need a program that can do this particular task?

I suppose you could use code from a book like "Numerical Recipes"; there is probably source code posted online somewhere.

Also, there is a Javascript program posted here (Linear Least-Squares Data-fitting Tool):

http://www.akiti.ca/LinLeastSqPoly4.html

Viewing the source code is straightforward; you'd just have to translate it to C or C++.


>> "I am looking for a fast C++ code (or c code . . ."

Regarding the "fast" aspect, I don't know how various programs compare. Why does it matter? They will all do it in the blink of an eye.

If you really want to optimize the program for execution speed, no matter what existing code you start with, you will most likely have to edit it quite a bit to make it specific to just your application (i.e. - eliminate many options, if-conditions, routines for handling non-quadratic polynomials, etc.

Hope this helps.
I was hurrying and wrote it myself. This function is being called 10-70 million of times, therefore I need it to be as fast as possible. Is it possible to optimize smth in findQuadCoefficients f-n?
'main' is just for testing. I believe it works correctly, though any comments are welcome.
In the output I only use coefficient 'a' for x^2, critical point of obtained quadratic polynomial (just realised that I don't need to return 'b' which is the coefficient for x). And constant coeffient not used at all.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
#include <string>
#include <cmath>
#include <iostream>

using namespace std;
inline bool findQuadCoefficients(double timeArray[], double valueArray[],double &a, double &b, double &critPoint, int PointsNum){
	//return false if error
	//a and b are not calculated now

//	cout << "Entering findQuadCoefficients:" << endl;

	const double S00=PointsNum;//points number
	double S40=0, S10=0, S20=0, S30=0, S01=0, S11=0, S21 = 0;
	double value = 0, indexPow2 = 0;
	const double MINvalue = valueArray[0];
	const double MINtime = timeArray[0];
	int index = -1;
	for (int i=0; i<PointsNum; i++ ){
		index = timeArray[i] - MINtime;
		value = (valueArray[i] - MINvalue); //normalizing
//		cout << "i=" << i << " index=" << index << " value=" << value << endl;
		S40+= pow(index,4);
		S30+= pow(index,3);
		indexPow2 = pow(index,2);
		S20+= indexPow2;
		S10+= index;

		S01 += value;
		S11 += value*index;
		S21 += value*indexPow2;
	}


	double S20squared = pow(S20,2);
	//minors M_ij=M_ji
	double M11 = S20*S00;
	M11 -= pow(S10,2);

	double M21 = S30*S00;
	M21 -= S20*S10;
	double M22 = S40*S00;
	M22 -= S20squared;
	double M31 = S30*S10;
	M31 -= S20squared;

	double M32 = S40*S10;
	M32 -= S20*S30;
//	double M33 = S40*S20 - pow(S30,2);

	double discriminant = S40*M11;
	discriminant -= S30*M21;
	discriminant += S20*M31;
//	cout << "discriminant=" << discriminant << endl;
	if (abs(discriminant) < .00000000001) return  false;

	double Da = S21*M11;
	Da -= S11*M21;
	Da += S01*M31;
	a = Da/discriminant;
//	cout << "discriminant=" << discriminant;
//	cout << " Da=" << Da;

	double Db = -S21*M21;
	Db += S11*M22;
	Db -= S01*M32;
	b = Db/discriminant;
//	cout << " Db=" << Db << endl;

	critPoint = -Db/(2*Da) + MINtime; //-b/(2*a)= -Db/discriminant / (2*(Da/discriminant)) = -Db/(2*Da);

	return true;
};

//test

int main (int argc, char *argv[])
{
	double timeArray[9] = {10., 11., 12., 13., 14., 15., 16., 17., 18.};
	double valueArray[9];
	double a=13., b=-1., c=4.;
	double a2=0, b2=0, c2=0;
	double critPoint = 0;

	for (unsigned int i=0; i<9; i++ ){
		valueArray[i] = a*timeArray[i]*timeArray[i] +b*timeArray[i] + c;
	}
	bool success = findQuadCoefficients(timeArray, valueArray, a2, b2, critPoint, 44);
	cout << "success =" << success << " a2=" << a2 << " b2=" << b2 << endl;
}



Each time the call function is called , I copy data points from large arrays to small new array containing just required data points (between 11 and 44 points) with this code (probably this could be optimized as well):

1
2
3
4
5
6
7
8
9
10
11
const unsigned int HalfPoints = PointsNum/2;

	double valueArray[PointsNum];
	double timeArray[PointsNum];
	for (int i=0; i<PointsNum; i++){
		int timeIndex = curIndex - HalfPoints + i;
		valueArray[i] = searchArray[timeIndex];
		timeArray[i] = timeIndex;
	}

	bool success = findQuadCoefficients(timeArray, valueArray, a,b, critPoint, PointsNum);//ax^2+bx+c 
Last edited on
Is the fit linear or non-linear?


not sure what you mean. I am approximating data point (from 11 to 44 per findQuadCoefficients call) using single variable quadratic polynomial.


Do you need to write the program, or do you just need a program that can do this particular task?


its just a subroutine that is called millions of times per program run, which may take 10 hours or more.
"Is the fit linear or non-linear?"

not sure what you mean. . .


If you do a search for linear versus nonlinear data regression you will find a lot of info about the difference.

Also, even if you want a linear fit, keep in mind that some models minimize the 2-norm while others minimize the vertical distance between the points and the approximating curve.

For example, see Figure 2 on the following document:

http://onlinestatbook.com/2/regression/intro.html

The approximating line is the result of minimizing the total vertical distance between each point and the line.

However, some models/software compute the approximating line by minimizing the total 2-norms. In other words, assume a small line is drawn perpendicular to the approximating line to each point. These small line segments are all added up and it is this total that is now minimized.



Some suggestions:
- Since you're raising numbers to integer powers, you can replace expensive calls to pow() with simple multiplication. For example, lines 22-24 could be
1
2
3
    indexPow2 = index * index;
    S30 += indexPow2 * index;   // index^3
    S40 += indexPow2 * indexPow2   // index ^4 

Each statement introduces a sequence point. Use larger expressions in fewer statements to let the optimizer do it's thing. For example lines 50-52 could be:
double discriminant = S40*M11 - S30*M2 + S20*M31;

If I'm reading the code right, the first pass through the loop at line 18 will set all the values to zero, so skip that iteration and start the loop at i=1.
Topic archived. No new replies allowed.