Load balancing in TBB algorithms, particularly parallel_for and parallel_for_each

Dear experts

I sometimes wonder about load balancing in TBB algorithms.
I know that TBB automatically balances the task load across threads.

What is the difference between tbb::parallel_for and tbb::parallel_for_each with respect to load balancing?
(Q: Do TBB algorithms (e.g. parallel_do) automatically balance the load across chunks of data?)
For code like the following,
is the parallel performance of the two almost the same?
(Or does parallel_for have some advantage over parallel_for_each?)

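#include <cstddef>
#include <vector>
#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>
#include <tbb/parallel_for_each.h>

std::vector<Obj> v;

// initialize v

// Version 1: each body call receives a contiguous index subrange of v.
tbb::parallel_for(tbb::blocked_range<std::size_t>(0, v.size()),
    [&](const tbb::blocked_range<std::size_t>& r) {
        for (std::size_t j = r.begin(); j != r.end(); ++j) {
            // Do something to element v[j]
        }
    });

// Version 2: each body call receives a single element of v.
tbb::parallel_for_each(v.begin(), v.end(),
    [&](auto& v_elem) {
        // Do something to element v_elem
    });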


I apologize for my unclear question.

Although it is not a matter of load balancing,
I understand that tbb::parallel_for benefits from cache-efficient sequential access, because each body call walks a contiguous subrange.
(I observed a dramatic performance improvement.)
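
To picture what "a contiguous subrange per body call" means, here is a rough model in plain C++. It is not TBB code: IndexRange, split_into_chunks and make_chunks are names I made up for illustration. It assumes the range is simply halved until a piece is no larger than the grain size, which is how I understand tbb::blocked_range to be split. The default partitioner decides adaptively when to stop splitting, so the chunk sizes in a real run will likely differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical type (not part of TBB): a half-open index range [begin, end).
using IndexRange = std::pair<std::size_t, std::size_t>;

// Simplified model of recursive range splitting: halve [begin, end) until
// a piece holds at most `grainsize` indices, then record that piece.
// Real TBB hands such pieces to worker threads; here they are only collected.
void split_into_chunks(std::size_t begin, std::size_t end,
                       std::size_t grainsize,
                       std::vector<IndexRange>& chunks) {
    assert(grainsize > 0 && "grainsize must be positive");
    if (end - begin <= grainsize) {
        // Small enough: this piece would be processed by one body call.
        if (begin != end) chunks.emplace_back(begin, end);
        return;
    }
    // Split at the midpoint: left half is [begin, middle), right is [middle, end).
    const std::size_t middle = begin + (end - begin) / 2;
    split_into_chunks(begin, middle, grainsize, chunks);
    split_into_chunks(middle, end, grainsize, chunks);
}

// Convenience wrapper: chunk the index range [0, n).
std::vector<IndexRange> make_chunks(std::size_t n, std::size_t grainsize) {
    std::vector<IndexRange> chunks;
    split_into_chunks(0, n, grainsize, chunks);
    return chunks;
}
```

For example, make_chunks(10, 3) gives the pieces [0,2), [2,5), [5,7) and [7,10). Within each piece the indices are consecutive, so a loop over one piece touches neighbouring elements of the vector, which is the sequential access pattern I mentioned.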

I will investigate this topic further on my own.

Kind regards