Matrix Multiplication using multi-threads gets segmentation fault

Hello, I'm trying to write a program that multiplies square matrices of up to 1024 x 1024 using multi-threading. For example, I need to run a 1024 x 1024 matrix using 256, 64, 16 or 4 threads, or a 64 x 64 matrix using 16 or 4 threads. I thought I coded my program correctly, but I get a segmentation fault when I use a 720 x 720 matrix or larger. Here's the code:
#include <iostream>
#include <stdio.h>
#include <pthread.h>

using namespace std;


const int   DIM = 720; //works up to 719, crashes at 720
const int   num_of_thr = 4;
int         matrix_A[DIM][DIM];
int         matrix_B[DIM][DIM];
int         c[DIM][DIM];

struct v
{
    int i;
    int j;
};

//worker thread
void* matrix_multi(void* data)
{    
    for(int i = 0; i < DIM; i++)
    {
        for(int j = 0; j < DIM; j++)
        {
            c[i][j] = 0;
            for(int k = 0; k < DIM; k++)
            {
                c[i][j] += matrix_A[i][k] * matrix_B[k][j];
            }
        }
    }
    pthread_exit(0);
}

int main()
{

    pthread_t thr_id[DIM][DIM];
    pthread_attr_t thr_attr;
    pthread_attr_init(&thr_attr);



   //Filling the Matrices
    for(int i = 0; i < DIM; i++)
    {
        for(int j = 0; j < DIM; j++)
        {
            matrix_A[i][j]= i + j;
            matrix_B[i][j] = i + 3;
        }
    }


    //create the threads
    for(int i = 0; i < num_of_thr/2; i++)
    {
        for(int j = 0; j < num_of_thr/2; j++)
        {
            struct v *data = (struct v *) malloc(sizeof(struct v));
            data->i = i;
            data->j = j;
            pthread_create(&thr_id[i][j],NULL,matrix_multi,  &data);
        }
    }

    //joining the threads
    for(int i = 0; i < num_of_thr/2; i++)
    {
        for(int j = 0; j < num_of_thr/2; j++)
        {
        pthread_join(thr_id[i][j],NULL);
        }
    }

     return 0;
}


Any help would be appreciated, thanks in advance.
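The main problem is that you're overflowing the stack: thr_id is a DIM x DIM array of pthread_t declared inside main(), which comes to a few megabytes at DIM = 720 on typical platforms. (The matrices themselves are globals, so they are not on the stack.) Move it to dynamic memory.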
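
As a minimal sketch of that fix, assuming the only large stack object is the thr_id array in main(): the handles can live in a std::vector, which stores its elements on the heap, and the vector can be sized by the number of threads rather than by DIM. The names run_workers and bump are hypothetical and exist only to make the sketch self-contained.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

// Hypothetical worker for this sketch: increments the int it is given.
void* bump(void* arg)
{
    int* slot = static_cast<int*>(arg);
    ++*slot;
    return nullptr;
}

// Starts num_threads threads, joins them, and returns how many ran.
// The handles are stored in a std::vector, so they are heap-allocated,
// and there is one handle per thread instead of DIM * DIM of them.
int run_workers(int num_threads)
{
    assert(num_threads > 0);
    std::vector<pthread_t> handles(num_threads);
    std::vector<int> slots(num_threads, 0);  // one slot per thread, no sharing

    int started = 0;
    for (int t = 0; t < num_threads; ++t) {
        // pthread_create returns 0 on success; stop if the system refuses.
        if (pthread_create(&handles[t], nullptr, bump, &slots[t]) != 0)
            break;
        ++started;
    }
    for (int t = 0; t < started; ++t)
        pthread_join(handles[t], nullptr);

    int total = 0;
    for (int t = 0; t < started; ++t)
        total += slots[t];
    return total;
}
```

With this layout, changing DIM no longer affects how much stack main() needs.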

Also:
1. Unless you have 256 physical processors, you're not using threads properly: for CPU-bound work like this, threads beyond the number of cores only add overhead.
2. Your matrix_multi() never uses its parameter, so every thread computes the entire product. You're just using more threads without dividing any work.
3. You're using malloc() for no apparent reason, and the allocations are never freed.
4. You're passing the thread a pointer to a pointer: &data is the address of the local variable data, which will likely go out of scope before the thread can use it. (See line 65, the pthread_create() call. You should be passing data, not &data.)
5. It's usually not a good idea to parallelize matrix operations across more than one dimension. For example, if you need to add two matrices on n threads, split the destination matrix into n bands of height/n rows and give each thread one band only. That will greatly simplify your code. Alternatively, interleave the rows:
   1. Number the threads 0 to n-1.
   2. Thread i starts processing at row i.
   3. Once it's done with a row, it advances by n rows.
   For n=4: thread 0 processes rows 0, 4, 8, ...; thread 1 processes rows 1, 5, 9, ...; and so on.
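
The interleaved scheme could be sketched roughly as follows for multiplication. This is an illustration under assumptions, not a drop-in replacement for the posted program: the Matrix typedef, the Task struct, and the functions multiply_rows and multiply are hypothetical names, and the matrices are held in std::vector instead of global arrays. Each thread receives a pointer to its own Task, which stays alive until after the joins.

```cpp
#include <cassert>
#include <pthread.h>
#include <vector>

typedef std::vector<std::vector<int> > Matrix;

// Everything one worker needs (hypothetical struct for this sketch).
struct Task
{
    const Matrix* a;
    const Matrix* b;
    Matrix* c;
    int first_row;  // the thread's index, which is also its first row
    int stride;     // total number of threads
};

// Computes rows first_row, first_row + stride, first_row + 2 * stride, ...
void* multiply_rows(void* arg)
{
    Task* task = static_cast<Task*>(arg);
    const Matrix& a = *task->a;
    const Matrix& b = *task->b;
    Matrix& c = *task->c;
    const int n = static_cast<int>(a.size());

    for (int i = task->first_row; i < n; i += task->stride) {
        for (int j = 0; j < n; ++j) {
            int sum = 0;
            for (int k = 0; k < n; ++k)
                sum += a[i][k] * b[k][j];
            c[i][j] = sum;  // no other thread writes to row i
        }
    }
    return nullptr;
}

// Multiplies two square matrices of equal size on num_threads threads.
Matrix multiply(const Matrix& a, const Matrix& b, int num_threads)
{
    assert(num_threads > 0);
    assert(a.size() == b.size());
    const int n = static_cast<int>(a.size());
    Matrix c(n, std::vector<int>(n, 0));
    std::vector<pthread_t> handles(num_threads);
    std::vector<Task> tasks(num_threads);
    std::vector<bool> started(num_threads, false);

    for (int t = 0; t < num_threads; ++t) {
        tasks[t] = Task{&a, &b, &c, t, num_threads};
        // Pass the address of the Task itself, not of a loop-local pointer.
        started[t] =
            pthread_create(&handles[t], nullptr, multiply_rows, &tasks[t]) == 0;
        if (!started[t])
            multiply_rows(&tasks[t]);  // fall back to the calling thread
    }
    for (int t = 0; t < num_threads; ++t)
        if (started[t])
            pthread_join(handles[t], nullptr);
    return c;
}
```

Calling multiply(a, b, 4) on two 1024 x 1024 matrices would give thread 0 rows 0, 4, 8, ... and so on, as in the n=4 example. For the banded variant, each worker would instead loop over one contiguous range of rows.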