How to use MPI_Gatherv or MPI_Gather after ScatterV

Dear All:

I found the code from:
ref: https://stackoverflow.com/questions/7549316/mpi-partition-matrix-into-blocks/7587133#7587133

How can we also add MPI_Gatherv (or MPI_Gather) after the scatter to reconstruct the same initial 2D matrix (the 8*12 example from the code)? In other words:
I would like to add MPI_Gatherv so that I recover the same 2D input matrix.
Currently the code does:
example: 8*12 matrix --> partitioned into 6 blocks of 2*3 using MPI_Scatterv, printing the same numbers as the input.
8*12 ---> Scatterv ---> 6 blocks of 2*3

Could you please add the Gatherv after the Scatterv so that we get back the original 8*12 matrix (values 0 to 95):

8*12 (0 to 95) ---> Scatterv ---> 6 blocks of 2*3 ---> Gatherv ---> 8*12 (0 to 95).

I would be more than happy if anyone replied to my post. Since I am a beginner in C++, please include your suggested code in my attached code so that I can understand it better.

If possible, I would also be interested in seeing MPI_Send and MPI_Recv used in this code. If that makes the code too complicated or makes no sense, just forget about it.

Thanks a lot.

code in following post
MPI_Scatterv and MPI_Gatherv are essentially reverse operations. The first scatters data from one processor to all the others in the communicator; the second gathers the (presumably-one-day-to-be-processed) data back from the other processors onto a single processor.

"Your" program uses
MPI_Scatterv(a, counts, disps, blocktype, b, BLOCKROWS*BLOCKCOLS, MPI_CHAR, 0, MPI_COMM_WORLD);
Simply reverse the send and receive operations and add the line
MPI_Gatherv (b, BLOCKROWS*BLOCKCOLS, MPI_CHAR, a, counts, disps, blocktype, 0, MPI_COMM_WORLD);
just before your MPI_Finalize statement. If you want to see the outcome just repeat lines 58-66 afterwards.

This program simply distributes an array of numbers to other processors and will now gather them back in again ... and that's it! It doesn't seem to have much relevance to the problems that you have posted in other recent threads.

Personally, I think you would be better off packing your own data into a buffer than using a new MPI_Datatype like this. It won't work if the partitions are different sizes, and you also leak a resource because you never free the derived datatype (with MPI_Type_free).


For basic MPI_Send and MPI_Recv I gave you an example in one of your other threads
http://www.cplusplus.com/forum/general/223807/#msg1025109
@lastchance

Thanks, as always, for providing useful information. I have updated the code based on your comments. Would you please check it and let me know if it's correct?

>>>Personally, I think you would be better packing your own data into a buffer than using a new MPI_Datatype like this. It won't work if the partitions are different sizes. You are also subject to a memory leak because you don't free it.

That's right, I should pack my own data and then delete it at the end, but unfortunately I haven't managed it yet. My C++ background isn't enough to remove the class and move its declarations inside the main function (I'm sorry, I tried a lot). I really need to finish this by today, and you already told me on the previous thread that I was trying to write far too much in one go, which was right. I decided to create the input array in this code, do the Scatterv and Gatherv, and finally apply your pack and unpack functions, but I still don't know the best way to do it.
I read the book that you referenced and I really like it, but I still don't have the knowledge to finish this programming task.
I would be more than happy if you would do me the favour (again) and help me finish this task.
I have done the update and it works.
Thanks.
Do you think this is better than my previous approaches?


Well, to be fair, that last code is @Jonnin's!
http://www.cplusplus.com/forum/general/224287/#msg1026826
No, I don't think that.
I just said that this code also gave me the idea that I could partition the matrix into a number of vertical slices, which might help. That code is in my previous post, where @jonnin helped.

Would you please let me know whether my Gatherv call in the above code is now printing correctly, based on your comment, or whether I am printing something wrong?

Thanks
I don't know why you have the comment
    // it doent look right to me??? the following statement
    if (rank == 0) {

What's wrong with it? All the data has just been gathered onto root (processor with rank 0) and so only this processor is in a position to print it out.

You have a spurious
printf("Local Matrix:\n");
on line 90 of your last-but-one code. It's not doing anything sensible: remove it.

Having done that, I run the code with two processors and get the output
c:\cprogs\parallel>mpiexec -n 2 temp.exe 
Rank = 0
Global matrix: 
  0   1   2   3   4   5   6   7   8   9  10  11 
 12  13  14  15  16  17  18  19  20  21  22  23 
 24  25  26  27  28  29  30  31  32  33  34  35 
 36  37  38  39  40  41  42  43  44  45  46  47 
 48  49  50  51  52  53  54  55  56  57  58  59 
 60  61  62  63  64  65  66  67  68  69  70  71 
 72  73  74  75  76  77  78  79  80  81  82  83 
 84  85  86  87  88  89  90  91  92  93  94  95 
Local Matrix:
  0   1   2   3   4   5 
 12  13  14  15  16  17 
 24  25  26  27  28  29 
 36  37  38  39  40  41 
 48  49  50  51  52  53 
 60  61  62  63  64  65 
 72  73  74  75  76  77 
 84  85  86  87  88  89 

Global matrix: 
  0   1   2   3   4   5   6   7   8   9  10  11 
 12  13  14  15  16  17  18  19  20  21  22  23 
 24  25  26  27  28  29  30  31  32  33  34  35 
 36  37  38  39  40  41  42  43  44  45  46  47 
 48  49  50  51  52  53  54  55  56  57  58  59 
 60  61  62  63  64  65  66  67  68  69  70  71 
 72  73  74  75  76  77  78  79  80  81  82  83 
 84  85  86  87  88  89  90  91  92  93  94  95 
Rank = 1
Local Matrix:
  6   7   8   9  10  11 
 18  19  20  21  22  23 
 30  31  32  33  34  35 
 42  43  44  45  46  47 
 54  55  56  57  58  59 
 66  67  68  69  70  71 
 78  79  80  81  82  83 
 90  91  92  93  94  95 

which is about what you expect. The first "global matrix" is printed by root just after MPI_Scatterv. The second one is printed out by root just after MPI_Gatherv. The global matrix is the same because your individual processors haven't done anything with their data, just sent it straight back again. So that bit works ... it just isn't very exciting.

Irrespective of where you put MPI_Barrier to attempt to synchronise things, you can't guarantee which order individual processors are going to feed the output buffer for cout, so the local matrices (printed by individual processors) could come in either order. Personally, I always arrange for root to do all the output.
@lastchance
Thanks for reply.
It's printing correctly after the last modification from your comment. I would also like to arrange things so that all the printing is done on root.
Now I would like to apply this to the previous code you provided.
Do you have any advice on how I can move all the declarations out of the class and into the main function (i.e. not use a class) and do the pack and unpack with this input matrix? In this assignment I am not allowed to use a class!
I know I may be taking up too much of your time, but I really couldn't figure it out by myself and I really need help.

Thanks
Yes, it's a class - but there's only one instance (per processor). Just move the code from the constructor into main, remove the object identifier (me.) and treat the member functions as normal functions or scraps of code. If it's not a class you need to pass more things as function parameters, that's all.

You will confuse everyone by spawning lots of new topics in this forum.
Thanks.
You are right, I should not start a new topic in this forum.
I will add my comment to the previous, related thread.
Would you please check my new comment on:
http://www.cplusplus.com/forum/general/223807/
Thanks a lot.