I wonder whether code 1 is correct and, if it is, whether it is efficient.
One more question: if code 1 is correct, why does the number of floating-point operations differ between code 1 and code 2? The only difference is that in code 2 the dimension and the matrices are defined at compile time, whereas in code 1 they are defined at run time.
For example, the measured floating-point operation count for size 16 is 24617 for code 1 and 8201 for code 2.
code 1
[code]
void test(int INDEX);

int main(){
    for(int i=1;i<64;i*=2)
        test(i);
}

void test(int INDEX){
    float matrixa[INDEX][INDEX], matrixb[INDEX][INDEX], mresult[INDEX][INDEX];
    //fill matrixa and matrixb with random numbers, set mresult to zero
    //calculate
    for (int i = 0; i < INDEX; i++)
        for (int j = 0; j < INDEX; j++)
            for (int k = 0; k < INDEX; k++)
                mresult[i][j] = mresult[i][j] + matrixa[i][k] * matrixb[k][j];
}
[/code]
code 2
[code]
#define INDEX 16
float matrixa[INDEX][INDEX], matrixb[INDEX][INDEX], mresult[INDEX][INDEX];

int main(){
    test(16);
}

void test(int INDEX){
    //fill matrixa and matrixb with random numbers
    //calculate
    for (int i = 0; i < INDEX; i++)
        for (int j = 0; j < INDEX; j++)
            for (int k = 0; k < INDEX; k++)
                mresult[i][j] = mresult[i][j] + matrixa[i][k] * matrixb[k][j];
}
[/code]
Just as a general note, you may want to look into using std::array instead of the old style of array.
This will allow you to use all the STL algorithms ( http://cplusplus.com/reference/algorithm/ ), which include algorithms for filling a container, as well as doing arithmetic with containers.
Those algorithms should be optimised for efficiency and are normally well tested and documented.
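To make the suggestion concrete, here is a minimal sketch of the same multiplication with std::array and a couple of STL algorithms. The names Matrix, fill_random and multiply are made up for this example, the size N is assumed to be known at compile time, and the nested std::array is assumed to be laid out as one block (which is what common implementations do).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical alias: an N x N matrix of floats with a compile-time size.
template <std::size_t N>
using Matrix = std::array<std::array<float, N>, N>;

// Fill every element with a pseudo-random value in [0, 1).
template <std::size_t N>
void fill_random(Matrix<N>& m, std::mt19937& gen) {
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    for (auto& row : m)
        std::generate(row.begin(), row.end(), [&] { return dist(gen); });
}

// result = a * b, using the same i, j, k loop order as the posted code.
template <std::size_t N>
void multiply(const Matrix<N>& a, const Matrix<N>& b, Matrix<N>& result) {
    // result is overwritten while a and b are read, so it must not alias them.
    assert(&a != &result && &b != &result);
    for (auto& row : result)
        row.fill(0.0f);  // the accumulator has to start at zero
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k)
                result[i][j] += a[i][k] * b[k][j];
}
```

With this, a 16 x 16 case would declare three Matrix<16> objects, call fill_random on the two inputs with a seeded std::mt19937, and then call multiply.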
You are right, but I need to use a low-level array definition without pointers, and I need a single chunk of memory allocated for my 2D array. An array of pointers to pointers, or vectors (which are dynamic arrays), uses separate chunks of memory that have less locality. Besides that, I am still wondering what the root cause of this variation in my experiments is. Anyway, thanks for your comments; they were useful.
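On the single-chunk requirement: a vector of vectors does use separate allocations, but one flat std::vector holds all n*n elements in a single contiguous block and still allows a run-time size. A minimal sketch, where FlatMatrix and its at() accessor are hypothetical names invented for this example and row-major storage is an assumption:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: an n x n matrix in ONE contiguous allocation,
// stored row by row, so element (i, j) lives at index i * n + j.
struct FlatMatrix {
    std::size_t n;
    std::vector<float> data;

    // All elements start at zero.
    explicit FlatMatrix(std::size_t size) : n(size), data(size * size, 0.0f) {}

    float& at(std::size_t i, std::size_t j) { return data[i * n + j]; }
    float at(std::size_t i, std::size_t j) const { return data[i * n + j]; }
};

// Returns a * b using the same i, j, k loop order as the posted code.
FlatMatrix multiply(const FlatMatrix& a, const FlatMatrix& b) {
    assert(a.n == b.n);  // only square matrices of equal size are handled
    const std::size_t n = a.n;
    FlatMatrix result(n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                result.at(i, j) += a.at(i, k) * b.at(k, j);
    return result;
}
```

This keeps the size a run-time value, as in code 1, without relying on the VLA extension.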
The first code, if it is meant to be C++, is invalid: C++ does not support VLAs (variable-length arrays).
The second code is also invalid and shall not compile, because after preprocessing the macro INDEX is replaced inside the parameter list and you get the declaration void test(int 16), which is ill-formed.
The first example compiles because gcc has its own non-standard language extensions.
The second example either shall not compile or you wrote it here incorrectly.
Well, I don't know why, but I ran exactly this code and the results are as above. My main question, though, is the difference in the number of flops. The theoretical flop count for matrix multiplication as I implemented it is 2n^3 = 2*16*16*16 = 8,192, which matches the second code (8201), but for the first code it is 24617, something like 6n^3.
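The 2n^3 baseline can be checked by counting directly. The sketch below uses a hypothetical helper, count_flops, that walks the same triple loop and counts one multiplication and one addition per innermost iteration. It only confirms the theoretical count; it does not measure what the compiler actually emits, so it says nothing about where the extra operations in code 1 come from.

```cpp
#include <cstddef>

// Hypothetical helper: number of floating-point operations performed by the
// naive triple loop for an n x n product, counting one multiplication and
// one addition per innermost iteration.
unsigned long long count_flops(std::size_t n) {
    unsigned long long flops = 0;
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                flops += 2;  // mresult[i][j] + matrixa[i][k] * matrixb[k][j]
    return flops;
}
```

For n = 16 this gives 8192, so the 8201 measured for code 2 is 9 operations above the baseline, while 6n^3 would be 24576, which is 41 below the 24617 measured for code 1.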