Matrix Multiplication using multi-threads gets segmentation fault

Hello, I'm trying to write a program that multiplies square matrices of up to 1024 x 1024 using multiple threads. For example, I need to run a 1024 x 1024 matrix using 256, 64, 16, or 4 threads, or a 64 x 64 matrix using 16 or 4 threads. All the matrices are square. I thought I coded my program correctly, but I get a segmentation fault when I use a 720 x 720 matrix or larger. Here's the code:
```cpp
#include <iostream>
#include <cstdlib>
#include <pthread.h>

using namespace std;

const int DIM = 720; //works up to 719, crashes at 720
const int num_of_thr = 4;

int matrix_A[DIM][DIM];
int matrix_B[DIM][DIM];
int c[DIM][DIM];

struct v
{
    int i;
    int j;
};

//worker thread
void* matrix_multi(void* data)
{
    for(int i = 0; i < DIM; i++)
    {
        for(int j = 0; j < DIM; j++)
        {
            c[i][j] = 0;
            for(int k = 0; k < DIM; k++)
            {
                c[i][j] += matrix_A[i][k] * matrix_B[k][j];
            }
        }
    }
    pthread_exit(0);
}

int main()
{
    pthread_t thr_id[DIM][DIM];
    pthread_attr_t thr_attr;
    pthread_attr_init(&thr_attr);

    //Filling the Matrices
    for(int i = 0; i < DIM; i++)
    {
        for(int j = 0; j < DIM; j++)
        {
            matrix_A[i][j]= i + j;
            matrix_B[i][j] = i + 3;
        }
    }

    //create the threads
    for(int i = 0; i < num_of_thr/2; i++)
    {
        for(int j = 0; j < num_of_thr/2; j++)
        {
            struct v *data = (struct v *) malloc(sizeof(struct v));
            data->i = i;
            data->j = j;
            pthread_create(&thr_id[i][j],NULL,matrix_multi, &data);
        }
    }

    //joining the threads
    for(int i = 0; i < num_of_thr/2; i++)
    {
        for(int j = 0; j < num_of_thr/2; j++)
        {
            pthread_join(thr_id[i][j],NULL);
        }
    }

    return 0;
}
```

Any help would be appreciated, thanks in advance.
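The main problem is that you're overflowing the stack by allocating a huge matrix on it: `pthread_t thr_id[DIM][DIM]` in main() is a local array of DIM * DIM thread handles, even though you only ever create a handful of threads. (The three global int matrices are not on the stack, so they aren't the cause.) Size that array to the number of threads you actually create, or move to dynamic memory.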
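As a rough illustration of "move to dynamic memory", here is a minimal sketch that keeps large storage off the stack. The names `HeapMatrix` and `make_thread_handles` are hypothetical and not from the original post; the sketch assumes a row-major layout in a `std::vector`, which is only one of several reasonable choices.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <vector>

// Hypothetical helper (not from the original post): a square matrix stored
// row-major in one heap allocation, so its size is not limited by the stack.
struct HeapMatrix {
    std::size_t dim;
    std::vector<int> cells;

    explicit HeapMatrix(std::size_t n) : dim(n), cells(n * n, 0) {}

    // Element access with a bounds check that is active in debug builds.
    int& at(std::size_t row, std::size_t col) {
        assert(row < dim && col < dim);
        return cells[row * dim + col];
    }
};

// Hypothetical helper: one handle per thread actually created,
// instead of DIM * DIM handles in a local array.
std::vector<pthread_t> make_thread_handles(std::size_t num_threads) {
    return std::vector<pthread_t>(num_threads);
}
```

With this, something like `HeapMatrix a(1024);` and `make_thread_handles(4)` inside main() would place only a few small objects on the stack; the bulk of the data lives on the heap.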

Also:
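1. Unless you have 256 physical processors, you're not using threads properly. For CPU-bound work like this, running more threads than you have cores only adds scheduling overhead.
2. Your matrix_multi() never uses its parameter, so every thread computes the entire product and they all write to the same c at the same time. You're just using more threads without dividing any work.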
3. You're using malloc() for no apparent reason.
4. You're passing to a thread a pointer to a pointer that will likely go out of scope before the thread can use it. (See the pthread_create() call in your thread-creation loop. You're passing &data, the address of the local pointer variable, instead of data itself, which is what you should be passing.)
5. It's usually not a good idea to parallelize matrix operations across more than one dimension. For example, if you need to add two matrices on n threads, divide the destination matrix every height/n rows and give each thread those rows only. That will greatly simplify your code. Alternatively:
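To make points 2, 4, and 5 concrete, here is a minimal sketch of the row-splitting idea applied to multiplication. All names (`RowRange`, `multiply_rows`, `parallel_multiply`) are hypothetical, and the matrices are assumed to be square and stored row-major in `std::vector<int>`. Each thread receives a pointer to its own job struct, which stays alive until after the join.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <vector>

// Hypothetical job description: which rows of the result one thread computes.
struct RowRange {
    const std::vector<int>* a;  // left operand, dim x dim, row-major
    const std::vector<int>* b;  // right operand, dim x dim, row-major
    std::vector<int>* c;        // result, dim x dim, row-major
    std::size_t dim;
    std::size_t row_begin;      // first row this thread computes
    std::size_t row_end;        // one past the last row
};

// Worker: computes only rows [row_begin, row_end) of c = a * b,
// so no two threads ever write to the same element.
void* multiply_rows(void* arg) {
    const RowRange* job = static_cast<const RowRange*>(arg);
    const std::size_t n = job->dim;
    for (std::size_t i = job->row_begin; i < job->row_end; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += (*job->a)[i * n + k] * (*job->b)[k * n + j];
            }
            (*job->c)[i * n + j] = sum;
        }
    }
    return nullptr;
}

// Splits the result rows across num_threads threads and waits for all of them.
// Returns false if a thread could not be created.
bool parallel_multiply(const std::vector<int>& a, const std::vector<int>& b,
                       std::vector<int>& c, std::size_t dim,
                       std::size_t num_threads) {
    assert(num_threads > 0);
    assert(a.size() == dim * dim && b.size() == dim * dim && c.size() == dim * dim);

    std::vector<pthread_t> threads(num_threads);
    std::vector<RowRange> jobs(num_threads);  // stays alive until after the joins
    std::size_t created = 0;
    bool ok = true;

    for (std::size_t t = 0; t < num_threads; ++t) {
        jobs[t] = RowRange{&a, &b, &c, dim,
                           dim * t / num_threads, dim * (t + 1) / num_threads};
        // Pass the job itself, not the address of a temporary pointer to it.
        if (pthread_create(&threads[t], nullptr, multiply_rows, &jobs[t]) != 0) {
            ok = false;
            break;
        }
        ++created;
    }
    for (std::size_t t = 0; t < created; ++t) {
        pthread_join(threads[t], nullptr);
    }
    return ok;
}
```

A call such as `parallel_multiply(a, b, c, 1024, 4)` would give each of the four threads a contiguous band of 256 rows. No malloc() is needed, because the job structs live in a vector owned by the calling function.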