Parallelization using OMP.

Hello,

I have a piece of code that I believe could benefit from parallelization. In short, it's a bunch of independent calculations (embarrassingly parallel). However, if the result of a calculation exceeds a specific value, it is added to the Heap.

#pragma omp parallel for private(val) schedule(guided) num_threads(8)
for (int i = 0; i < n; ++i) {
   doSomeCalc(i);
   for (int j = 0; j < n; ++j) {
      if (someCheck) {                 // ~80% pass
         val = doOtherCalc(i, j);
         if (val > LIMIT) {            // ~0.2% pass
            #pragma omp critical
            { addToheap(i, j); }
         }
      }
   }
}

Even though only a very small portion of the entries needs to be added to the Heap (~0.2%), the critical section kills any benefit of parallelization.

I've worked around it by doing the calculations first, in parallel, and then running over all results and adding to the Heap where necessary. It performs better, but I feel like I'm wasting potential, as the Heap is still the bottleneck of the scheme.
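
For reference, the two-phase version looks roughly like this (same placeholder names as above; the dense results buffer, initialised to -infinity as a "not computed" marker, is just the simplest way to write it down; needs <vector> and <limits>):

// Phase 1: do all calculations in parallel, park the values in a buffer.
// Phase 2: one serial pass pushes the rare qualifying entries onto the
// Heap, so no critical section is needed anywhere.
std::vector<std::vector<double>> results(
   n, std::vector<double>(n, -std::numeric_limits<double>::infinity()));

#pragma omp parallel for schedule(guided) num_threads(8)
for (int i = 0; i < n; ++i) {
   doSomeCalc(i);
   for (int j = 0; j < n; ++j)
      if (someCheck)                               // ~80% pass
         results[i][j] = doOtherCalc(i, j);
}

for (int i = 0; i < n; ++i)                        // serial, but cheap:
   for (int j = 0; j < n; ++j)                     // only ~0.2% exceed LIMIT
      if (results[i][j] > LIMIT)
         addToheap(i, j);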

Any idea how to fix this? I feel like the critical section locks too much. My first idea would be to have one dedicated thread for adding things to the Heap, but I'm not sure how to implement this.

Any and all advice is welcome!
Gaminic wrote:
My first idea would be to have one dedicated thread for adding things to the Heap, but I'm not sure how to implement this.


Pseudocode (a rough C++ sketch follows below):

array add_to_heap;

1. Create a thread that runs a function.
1a. Wait for a condition variable.
1b. Empty the valid array elements into the heap (use atomic flags).
1c. Repeat.

2. Have each of your worker threads evaluate whether their result should be heaped.
2a. If the result is to be heaped:
2b. Add it to add_to_heap[x], where x is unique for each thread.
2c. Notify the condition variable.
2d. Else just return, I guess.
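
Something along these lines, using C++11 threads for the heap thread alongside your OpenMP workers. This is only a sketch: doSomeCalc, doOtherCalc, someCheck, LIMIT and addToheap are the placeholders from your post, and the slot layout and shutdown handling are just one way to wire it up.

#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>
#include <omp.h>

// One hand-off slot per worker thread (single producer, single consumer).
struct Slot {
   std::atomic<bool> full{false};
   int i = 0;
   int j = 0;
};

void computeAndHeap(int n) {
   const int n_threads = 8;
   std::vector<Slot> slots(n_threads);
   std::mutex mtx;
   std::condition_variable cv;
   std::atomic<bool> done{false};

   // 1. Dedicated heap thread: wait on the condition variable,
   //    drain any full slots into the Heap, repeat until told to stop.
   std::thread heaper([&] {
      std::unique_lock<std::mutex> lock(mtx);
      for (;;) {
         cv.wait(lock, [&] {
            if (done.load()) return true;
            for (auto& s : slots)
               if (s.full.load()) return true;
            return false;
         });
         for (auto& s : slots) {
            if (s.full.load(std::memory_order_acquire)) {
               addToheap(s.i, s.j);   // only this thread ever touches the Heap
               s.full.store(false, std::memory_order_release);
            }
         }
         if (done.load()) break;
      }
   });

   // 2. Workers: when a value qualifies, park (i, j) in this thread's
   //    slot and poke the heap thread; no critical section needed.
   #pragma omp parallel for schedule(guided) num_threads(n_threads)
   for (int i = 0; i < n; ++i) {
      Slot& my = slots[omp_get_thread_num()];
      doSomeCalc(i);
      for (int j = 0; j < n; ++j) {
         if (someCheck) {                          // ~80% pass
            double val = doOtherCalc(i, j);
            if (val > LIMIT) {                     // ~0.2% pass
               while (my.full.load(std::memory_order_acquire))
                  ;                                // previous entry not drained yet; rare, so spin
               my.i = i;
               my.j = j;
               my.full.store(true, std::memory_order_release);
               { std::lock_guard<std::mutex> lk(mtx); } // avoid a lost wake-up
               cv.notify_one();
            }
         }
      }
   }

   // All workers are done: tell the heap thread to drain and exit.
   done.store(true);
   { std::lock_guard<std::mutex> lk(mtx); }
   cv.notify_one();
   heaper.join();
}

The mutex is only there for the condition variable; the hand-off itself goes through the atomic flags, so a worker never waits on a heap insertion except in the rare case where its own previous entry hasn't been drained yet.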