Synchronize only certain processors in MPI

Is it possible to use MPI_Barrier to synchronize only certain processors? I have all processors but one doing calculations within a for loop. I wish to have them synchronize at the end of every iteration. However, since there is one processor that never enters that for loop, I think the program would never proceed because it would be waiting for the last processor to reach that line, but it never will.
Why don't you let that last processor enter the loop? All processors have access to the same code. A simple if statement would stop that particular processor from doing anything in the loop other than waiting at MPI_Barrier.

Otherwise you would have to use a different MPI_Communicator for all the remaining processes.

Use MPI_Barrier as rarely as possible (outside of debugging). It defeats the whole object of parallelisation.
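For illustration, here is a sketch of both suggestions. The iteration count pValue and the worker calculation are placeholders, not taken from the thread:

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int pValue = 4;   // illustrative iteration count

    // Option 1: let every rank enter the loop; rank 0 does no work
    // but still reaches the barrier, so nobody hangs.
    for (int i = 0; i < pValue; i++)
    {
        if (rank != 0)
        {
            // ... worker calculation here ...
        }
        MPI_Barrier(MPI_COMM_WORLD);
    }

    // Option 2: give the workers their own communicator and barrier on
    // that, so rank 0 of MPI_COMM_WORLD never has to participate.
    MPI_Comm workers;
    int colour = (rank == 0) ? MPI_UNDEFINED : 1;  // rank 0 gets MPI_COMM_NULL
    MPI_Comm_split(MPI_COMM_WORLD, colour, rank, &workers);
    if (workers != MPI_COMM_NULL)
    {
        for (int i = 0; i < pValue; i++)
        {
            // ... worker calculation here ...
            MPI_Barrier(workers);   // synchronizes the workers only
        }
        MPI_Comm_free(&workers);
    }

    if (rank == 0) std::printf("done\n");
    MPI_Finalize();
    return 0;
}

With the split communicator, MPI_Barrier(workers) never waits on rank 0, because rank 0 is not a member of that communicator at all.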
@lastchance Here is the code I'm working with. The for loop will iterate a certain number of times, each time generating a random number. I wanted to put an MPI_Barrier at the beginning of the for loop so that no problems arise when rank 0 is collecting and processing the information.

Two questions:
1. I know that you can choose the data count when sending. Is it possible for me to change it to 3 and thus allow me to send all 3 variables at once? That way I could put the MPI_Send outside the for loop and not worry about using MPI_Barrier.
2. Could you show me a sample based on the code I provided that would solve the problem using a simple if statement?


if (rank != 0)
{
	for (int i = 0; i < pValue; i++)
	{
		MPI_Barrier(MPI_COMM_WORLD);
		unsigned seed = static_cast<unsigned>(std::chrono::system_clock::now().time_since_epoch().count());
		std::default_random_engine generator(seed);

		// create a distribution that generates random numbers between 0.0 and 1.0 inclusive
		std::uniform_real_distribution<double> distribution(0.0, 1.0);
		double randomX = distribution(generator);
		double randomY = distribution(generator);

		double check = pow(randomX, 2) + pow(randomY, 2);
		MPI_Send(&check, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
	}
}
You don't need to use MPI_Barrier. Full stop.

Processor with rank = 0 should have a corresponding MPI_Recv (which I can't see anywhere in your code). This will include the sender's rank for identification. Transfers between any particular pair of processors are guaranteed to arrive in sequential order. I can't see why there should be any problem with the collecting and receiving of data. All MPI_Barrier will do is slow your code down.

If you are sending multiple pieces of data then you should collect them all in a buffer array and send them all at once. At the moment you are sending just one value of check at a time; that is very inefficient and slow.

Remove your MPI_Barrier call. Create a buffer of size pValue. Generate your values and put them into it. Then do a single MPI_Send with all pValue data points after the loop that generated them has finished.
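A sketch of that restructuring, both sides. The names follow the snippet above; the receive side and the pi-estimate bookkeeping are illustrative assumptions, not code from the thread. Note the generator is also seeded once, outside the loop, and offset by rank so the workers don't all produce the same stream:

#include <mpi.h>
#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int pValue = 1000;   // illustrative sample count per worker

    if (rank != 0)
    {
        // Seed once, outside the loop, offset by rank.
        unsigned seed = static_cast<unsigned>(
            std::chrono::system_clock::now().time_since_epoch().count()) + rank;
        std::default_random_engine generator(seed);
        std::uniform_real_distribution<double> distribution(0.0, 1.0);

        std::vector<double> buffer(pValue);
        for (int i = 0; i < pValue; i++)
        {
            double x = distribution(generator);
            double y = distribution(generator);
            buffer[i] = x * x + y * y;
        }
        // One send for the whole batch -- no barrier needed.
        MPI_Send(buffer.data(), pValue, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }
    else
    {
        std::vector<double> buffer(pValue);
        long inside = 0, total = 0;
        for (int src = 1; src < size; src++)
        {
            MPI_Recv(buffer.data(), pValue, MPI_DOUBLE, src, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            for (double check : buffer)
                if (check <= 1.0) ++inside;
            total += pValue;
        }
        if (total > 0)
            std::printf("pi estimate: %f\n", 4.0 * static_cast<double>(inside) / total);
        else
            std::printf("no workers\n");
    }

    MPI_Finalize();
    return 0;
}

The single MPI_Send per worker replaces pValue separate sends, and the blocking MPI_Recv on rank 0 provides all the ordering that matters.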

https://www.mpich.org/static/docs/latest/www3/MPI_Send.html
https://www.mpich.org/static/docs/latest/www3/MPI_Recv.html

There are faster, non-blocking, send and receive commands, but I guess that's for another day.
@lastchance I had the MPI_Recv in a section of the code that I did not post. Thank you for clarifying how MPI works. I did as suggested and it worked great.

Can you describe to me a situation in which MPI_Barrier would be necessary?

I only use MPI_Barrier when debugging parallel code, to confirm whether or not it is the inter-processor communication that is causing problems. Another possible reason for using it is if all processors are involved in output and that output must come in a particular real-time order. (I don't do that: I code so that only the root processor does any input/output.)

In most circumstances MPI_Barrier should be unnecessary. Since memory is distributed, not shared, a processor should be able to get along with its job and need not worry about how far in front or behind the other processors are. When it needs data from elsewhere it can call MPI_Recv (or MPI_Sendrecv, etc.) ... and it will not progress beyond that point until it has the data it needs.
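To illustrate that last point, here is a minimal MPI_Sendrecv sketch (a ring exchange, not code from this thread): each rank blocks until its neighbour's data arrives, so the receive itself does the synchronizing and no barrier is needed.

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Send our rank to the right neighbour, receive from the left,
    // in one combined (deadlock-free) call.
    int left  = (rank - 1 + size) % size;
    int right = (rank + 1) % size;
    int received = -1;
    MPI_Sendrecv(&rank, 1, MPI_INT, right, 0,
                 &received, 1, MPI_INT, left, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    std::printf("rank %d got %d from rank %d\n", rank, received, left);

    MPI_Finalize();
    return 0;
}

Each rank proceeds the moment its own receive completes, however far ahead or behind the other ranks may be.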