The use of "->"

I am trying to port a CPU matrix multiplication to the GPU. Below is part of the CPU version:
MMUL(K1, tmp1,tmp2);
Instead of this I will use the call below to run it on the GPU using OpenCL. Is this correct, and what is the "->" for? I was asked to separate tmp1 and tmp2 into real and imaginary parts, since the matrices a, b and c are declared as *ar, *ai and so on, as you can see in void GPUmmul further below.
GPUmmul(K1, tmp1->real,tmp1->imag,tmp2->real,tmp2->imag);

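// Complex matrix product a = b * c, with real and imaginary parts in separate row-major arrays.
// b is rb x cb, c is cb x cc, a is rb x cc.
void GPUmmul(cl_float *ar,cl_float *ai, cl_float *br,cl_float *bi, cl_float *cr,cl_float *ci, long rb, long cb, long cc)
{
	long r,c,m,tb,tc,ta;
	cl_float tmpr,tmpi;

	#pragma omp parallel for private(r,c,m,tc,tb,ta,tmpr,tmpi)
	for(r=0;r<rb;r++)
	{
		for (c=0;c<cc;c++)
		{
			tmpr=0.;
			tmpi=0.;
			ta=r*cc+c; // index of a(r,c)
			for(m=0;m<cb;m++)
			{
				tb=r*cb+m; // index of b(r,m)
				tc=m*cc+c; // index of c(m,c)
				// accumulate (br + i*bi)*(cr + i*ci)
				tmpr += br[tb]*cr[tc]-bi[tb]*ci[tc];
				tmpi += bi[tb]*cr[tc]+br[tb]*ci[tc];
			}
			ar[ta]=tmpr;
			ai[ta]=tmpi;
		}
	}
}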


Show the type of tmp1.
It's declared like this; fcomplex is a special type. I will show fcomplex below too.
fcomplex R1,tmp1,tmp2,iR2,K1,K2;

	R1.zeros(4*K,4*K);
	//%R2=zeros(4*K,4*K);
	tmp1.zeros(4*K,4*K);
	tmp2.zeros(4*K,4*K);
	iR2.zeros(4*K,4*K);
	K1.zeros(4*K,4*K);
	K2.zeros(4*K,4*K);


fcomplex tmp;
tmp.zeros( X, Y );
foo( tmp->real );

Ok, no declaration of fcomplex::zeros() as far as I can see.

tmp is not a pointer type, so dereferencing it with -> should not be possible, unless fcomplex overloads operator-> (but you did not show that either).
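To illustrate the "unless" case, here is a minimal sketch of a non-pointer class that overloads operator->. Planes, Handle and read_real are hypothetical names made up for this example; nothing here is taken from the real fcomplex, whose definition is not shown in the thread.

```cpp
#include <cassert>

// Hypothetical payload type; NOT the fcomplex from the thread.
struct Planes {
    float real;
    float imag;
};

// Hypothetical wrapper: an ordinary class (not a pointer) that overloads operator->.
class Handle {
public:
    explicit Handle(Planes* p) : p_(p) {}
    // The compiler applies -> again to whatever this returns.
    Planes* operator->() const { return p_; }
private:
    Planes* p_;
};

// h is an object, yet h->real compiles because Handle::operator-> returns a Planes*.
inline float read_real(const Handle& h) { return h->real; }
```

Without such an overload, h->real on a plain object is a compile error, which is why the type of tmp1 matters.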
I've edited the post to show fcomplex completely.
Does that change the answer?
Does the same function that declares tmp1 also call the GPUmmul?
This is the function that contains the CPU matrix multiplication, which I want to port to the GPU using OpenCL. (The commented-out lines are the MATLAB version of the code.)

MMUL(K1,  tmp1,tmp2); 
		//% K1 = (R1 - dy^2/48*iR2)\(eye(4*K) + i*(3-2*sqrt(3))/12*dy*iR2);

        #pragma omp parallel for private(ii,jj)
		for (ii=1;ii<=4*K;ii++)
		{
			for (jj=1;jj<=4*K;jj++)
			{
				tmp2.ld(ii,jj,  cmplx(0,(3.f+2.f*sqrt(3.f))/12.f*dy)* K1(ii,jj));
			} //end
			tmp2.ld(ii,ii,   tmp2(ii,ii)+(1.f)); // %eye
		} //end
		//%K2 =               iR2 *(eye(4*K) + i*(3+2*sqrt(3))/12*dy* K1);
		//%K2 = iR2*tmp2;
		MMUL(K2,  iR2,tmp2);
    
		#pragma omp parallel for private(ii,jj)
		for (ii=1;ii<=4*K;ii++)
		{
			for (jj=1;jj<=4*K;jj++)
			{
				tmp2.ld(ii,jj,  cmplx(0,dy/2)*(K1(ii,jj)+K2(ii,jj)));
			} //end
			tmp2.ld(ii,ii,   tmp2(ii,ii)+(1.f));// %eye
		} //end
		//%Fnm = (eye(4*K) + i*dy/2*(K1+K2))*Fnm; 
		MMUL(tmp1,  tmp2,Fnm);
		Fnm=tmp1;
	} //end
} //end 
I cannot see -> used there.
Instead of
MMUL(K1, tmp1,tmp2);
I was asked to use this
GPUmmul(K1, tmp1->real,tmp1->imag,tmp2->real,tmp2->imag);
GPUmmul replaces the normal CPU MMUL, and I was told to pass the real and imaginary parts of tmp1 and tmp2 separately because the GPU can't take them whole: a 2 GB graphics card is limited to roughly 8000 by 8000 matrices.
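As a rough check of that figure, the sketch below computes the memory needed for complex matrices stored as two float planes each. It assumes 4-byte floats and that all three matrices (a, b and c) are resident on the card at once; split_complex_bytes is a hypothetical helper, not part of the posted code.

```cpp
#include <cstdint>

static_assert(sizeof(float) == 4, "sketch assumes 4-byte floats");

// Bytes needed for `count` complex n x n matrices, each stored as
// one real plane and one imaginary plane of floats.
constexpr std::uint64_t split_complex_bytes(std::uint64_t n, std::uint64_t count)
{
    return count * 2u * n * n * sizeof(float);
}
```

Under those assumptions, split_complex_bytes(8000, 3) is 1,536,000,000 bytes, which is below 2 GiB, so the quoted limit is at least plausible.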
I don't even know the basic meaning of "->". Does it point to them as arrays?
The input parameters, which I vary to get more accurate and more intensive results, also use it in their function, as shown below:
void ps_Kugel_2D(DMDATA *p)
{
p->dim = 2;
p->dx = 10000; //% Grating period
p->dz = 10000;
p->Nx = 5; //5; //%Moden
p->Nz = 4; //4;
neonano wrote:
I don't even know the basic meaning of "->". Does it point to them as arrays?

At the bottom of http://www.cplusplus.com/doc/tutorial/classes/:
x.y       member y of object x
x->y      member y of object pointed to by x
(*x).y    member y of object pointed to by x (equivalent to the previous one)

So,
#include <cassert>

struct Foo {
  int gaz;
  void bar() {} // empty body so the example links
};

int main() {
  Foo a;
  Foo * b = &a; // b points to a
  a.gaz = 42;

  // all three call member function of same object 'a':
  a.bar();
  b->bar();
  (*b).bar();

  assert( a.gaz == b->gaz );
}

tmp1 and tmp2 are objects, not pointers, so their members are accessed with . rather than ->. Therefore, the call should be:
GPUmmul( K1, tmp1.real, tmp1.imag, tmp2.real, tmp2.imag );
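A self-contained sketch of the same idea follows. Note that GPUmmul as posted takes nine parameters (two output planes, four input planes and three dimensions), so a complete call probably also needs K1's planes and the sizes. SplitMatrix, mmul_split and multiply are hypothetical stand-ins: the real fcomplex layout is not shown in the thread, and plain float is used in place of cl_float so the sketch builds without the OpenCL headers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for fcomplex: separate real and imaginary planes, row-major.
struct SplitMatrix {
    long rows, cols;
    std::vector<float> real, imag;
    SplitMatrix(long r, long c) : rows(r), cols(c), real(r * c), imag(r * c) {}
};

// Same loop nest as the posted GPUmmul: a = b * c, b is rb x cb, c is cb x cc.
void mmul_split(float* ar, float* ai, const float* br, const float* bi,
                const float* cr, const float* ci, long rb, long cb, long cc)
{
    for (long r = 0; r < rb; ++r) {
        for (long c = 0; c < cc; ++c) {
            float tr = 0.f, ti = 0.f;
            for (long m = 0; m < cb; ++m) {
                const long tb = r * cb + m;  // index of b(r,m)
                const long tc = m * cc + c;  // index of c(m,c)
                tr += br[tb] * cr[tc] - bi[tb] * ci[tc];
                ti += bi[tb] * cr[tc] + br[tb] * ci[tc];
            }
            ar[r * cc + c] = tr;
            ai[r * cc + c] = ti;
        }
    }
}

// out and lhs are objects (accessed with '.'); rhs is a pointer (accessed with '->').
void multiply(SplitMatrix& out, SplitMatrix& lhs, SplitMatrix* rhs)
{
    mmul_split(out.real.data(), out.imag.data(),
               lhs.real.data(), lhs.imag.data(),
               rhs->real.data(), rhs->imag.data(),
               lhs.rows, lhs.cols, rhs->cols);
}
```

With this layout, the equivalent of MMUL(K1, tmp1, tmp2) would be multiply(K1, tmp1, &tmp2); the pointer parameter is only there to show both access forms side by side.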
OK, great. Thanks for clearing up that doubt!
Do you think any other changes are needed in the function shooting_imp_2D for GPUmmul to work?
Could you have a look?