can't create more than a certain number of threads

I wrote a raytracing program that takes a very long time to finish, so I have been trying to multithread it using pthread_create(). It works perfectly for small test images, but it stops working once I increase the resolution. With some cout statements I've found that it seems to stop making new threads after a certain point. Does anyone know why that might be the case?

What I want to do is this: I want to create a number of threads equal to the number of processors on the computer, and I want each of them to get assigned a ray to trace. Then, if at any point one of them finishes tracing their ray, I'd like that thread to be assigned the next untraced ray. Since the amount of time it takes to trace a single ray can vary vastly depending on what kind of stuff it hits, I want to make sure that I allow each thread to fetch a new ray whenever it finishes the one it is working on, not whenever all of them finish what they're working on.
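One common way to get the "fetch the next untraced ray as soon as you finish" behavior described above is to start one long-lived thread per CPU and let them all pull indices from a shared atomic counter. This is only a minimal sketch, not the code from my program: traceRay, WorkerShared, worker and traceAll are hypothetical names, and traceRay is a stand-in for the real ray computation.

```cpp
#include <pthread.h>
#include <atomic>
#include <vector>

// Hypothetical stand-in for the expensive per-ray computation.
static int traceRay(int rayIndex) { return rayIndex * 2; }

// State shared by all worker threads (hypothetical helper type).
struct WorkerShared {
    std::atomic<int> nextRay{0};        // index of the next untraced ray
    int totalRays = 0;                  // how many rays exist in total
    std::vector<int>* results = nullptr; // one output slot per ray
};

// Each worker keeps claiming the next ray index until none are left.
static void* worker(void* arg) {
    WorkerShared* shared = static_cast<WorkerShared*>(arg);
    for (;;) {
        int ray = shared->nextRay.fetch_add(1); // atomically claim one ray
        if (ray >= shared->totalRays) break;    // nothing left to trace
        (*shared->results)[ray] = traceRay(ray);
    }
    return nullptr;
}

// Traces totalRays rays on up to numThreads threads and returns the results.
// Slots stay at -1 only if no thread at all could be created.
std::vector<int> traceAll(int totalRays, int numThreads) {
    std::vector<int> results(totalRays, -1);
    WorkerShared shared;
    shared.totalRays = totalRays;
    shared.results = &results;

    std::vector<pthread_t> threads;
    for (int i = 0; i < numThreads; ++i) {
        pthread_t t;
        if (pthread_create(&t, nullptr, worker, &shared) == 0) {
            threads.push_back(t);
        }
    }
    // Joining releases each thread's resources and waits for all the work.
    for (pthread_t t : threads) pthread_join(t, nullptr);
    return results;
}
```

With this layout only numThreads threads are ever created, no matter how many rays there are, so the number of pixels in the image does not affect the number of pthread_create() calls.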

Here's sort of an overview of what it's doing.

struct initializerArgs{
	Ray r;
	int x, y, threadID, rayCount;
};

//global array of booleans regarding whether the system is computing this pixel or not
bool isComputing[IMAGEWIDTH][IMAGEHEIGHT];
//the finalized image file
Color image[IMAGEWIDTH][IMAGEHEIGHT];

Color raytrace (Ray ri) {
	Color output;
	//a whole lotta math that takes a very long time to compute and sets output
	return output;
}

void* threadInitializer(void* args) {
	initializerArgs* data = (static_cast<initializerArgs*>(args));
// ============ COUT A ============
	cout << "inside thread " << data->threadID << " computing ray # " << data->rayCount << endl;
	image[data->x][data->y] = raytrace(data->r);
	//all computations are done
	isComputing[data->x][data->y] = false;
	pthread_exit(NULL);
}

int main (int argc, char** argv) {
	//This tells me how many CPUs are attached to the computer. Output has been verified.
	const int cpus = sysconf(_SC_NPROCESSORS_ONLN);
	//create an array of threads equal to the number of CPUs.
	pthread_t threads[cpus];
	//arrays telling the program what x and y values are being computed by each thread.
	int xt[cpus];
	int yt[cpus];
	//array of arguments to initialize the threads with.
	initializerArgs args[cpus];
	int i, j, k, xPtr, yPtr, rayCount;
	bool check = true;
	//since there are always many more height pixels than CPUs, preset xt and yt to the first pixels in the column.
	for (i = 0; i < cpus; i++) {
		xt[i] = 0;
		yt[i] = i;
	}
	//these values traverse the image concurrently with rayCount.
	xPtr = 0;
	yPtr = 0;
	rayCount = 0;
	while (check) {
		//check is set to false now, then set to true for any event that tells the system that the raytracer still has work to do
		check = false;
		for (i = 0; i < cpus; i++) {
			if (rayCount == IMAGEWIDTH * IMAGEHEIGHT) {
			//if there are no more rays to assign, just check if the processes are done computing yet, set check to true if not
				if (isComputing[xt[i]][yt[i]]) {
					check = true;
				}
			} else {
				//since there are more rays to assign, we know the system isn't done yet.
				check = true;
				if (!isComputing[xt[i]][yt[i]]) {
					//if this thread is not computing anything right now
					args[i].r = getRay();
					args[i].x = xPtr;
					args[i].y = yPtr;
					args[i].threadID = i;
					args[i].rayCount = rayCount;
// ============ COUT B ============
					cout << "beginning thread " << i << " computing ray # " << rayCount << endl;
					pthread_create(&threads[i], NULL, threadInitializer, &args[i]);
					rayCount++;
					yPtr++;
					if (yPtr == IMAGEHEIGHT) {
						yPtr = 0;
						xPtr++;
					}
				}
			}
		}
	}
}

So, with IMAGEHEIGHT and IMAGEWIDTH set to a low value, I get a chain of "beginning thread [a number 0-11, since this computer has 12 CPUs] computing ray # [increasing sequentially from 0 to IMAGEWIDTH * IMAGEHEIGHT]" interspersed with the "inside thread" versions of those lines. Both report rayCount values all the way up to IMAGEWIDTH * IMAGEHEIGHT - 1, like you would expect. However, with a larger image, I get the "beginning thread" version all the way up to IMAGEWIDTH * IMAGEHEIGHT, but I only get "inside thread" versions up to about 32745.
That would seem to tell me it's not creating more threads after that point, even though I'm limiting the number of threads the program makes to 12 at a time. Supporting this idea is the fact that the number it stops at is suspiciously close to 32768, which is a power of 2, so it might be some kind of hard limit on the total number of threads I can create. Am I doing something wrong here with the way I close out my threads?
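One way to test the "threads stop being created" theory directly is to look at what pthread_create() returns: it returns 0 on success and an error number on failure (EAGAIN when the system lacks the resources to create another thread). The overview code above ignores that return value. A minimal sketch of a checked wrapper follows; createChecked and noop are hypothetical names used only for illustration.

```cpp
#include <pthread.h>
#include <cstring>
#include <iostream>

// Trivial thread body, used only to demonstrate the wrapper.
static void* noop(void*) { return nullptr; }

// Hypothetical wrapper: returns true if the thread was created.
// pthread_create reports failure through its return value, not errno.
bool createChecked(pthread_t* thread, void* (*fn)(void*), void* arg) {
    int rc = pthread_create(thread, nullptr, fn, arg);
    if (rc != 0) {
        std::cerr << "pthread_create failed: " << std::strerror(rc) << std::endl;
        return false;
    }
    return true;
}
```

Calling something like createChecked(&threads[i], threadInitializer, &args[i]) in place of the bare pthread_create() call would print a message at the moment creation starts failing, instead of the "inside thread" lines silently disappearing.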
Either you have a 32-bit system and your processor can't count any higher, or it's a limitation of the version of pthreads you are using. I think pthreads is a *nix thing, isn't it? Someone in that section may know more about this.
Okay, I derped. After reading some tutorials, I tried making the threads detached, since the tutorials said that lets the system reclaim a thread's resources as soon as it terminates. After that, the program stopped at a higher, but still arbitrary, value.

It was then that I realized I never updated xt and yt. Setting xt[i] to xPtr and yt[i] to yPtr before advancing those values fixes the problem entirely. So detaching the threads fixed the problem of not being able to create more than a certain number of threads, and updating the values of xt and yt fixed the problem of it still stopping at an arbitrary count.
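For anyone who finds this later: a joinable thread that is never joined keeps its resources after it exits, which is why creating tens of thousands of them eventually fails. Creating the threads detached avoids that. Below is a rough sketch of the detached setup, not my exact code; runDetached, detachedTask and finishedCount are hypothetical names, and the atomic counter is only there so the caller can tell when each thread has run.

```cpp
#include <pthread.h>
#include <sched.h>
#include <atomic>

// Counts how many detached threads have run their body (illustration only).
static std::atomic<int> finishedCount{0};

static void* detachedTask(void*) {
    finishedCount.fetch_add(1);
    return nullptr;
}

// Starts `count` detached threads, one after another, and returns how many
// were successfully created. Detached threads cannot be joined, so the
// counter is used to wait for each one before starting the next.
int runDetached(int count) {
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    const int base = finishedCount.load();
    int started = 0;
    for (int i = 0; i < count; ++i) {
        pthread_t t;
        if (pthread_create(&t, &attr, detachedTask, nullptr) == 0) {
            ++started;
            // Wait until this thread has run before creating another.
            while (finishedCount.load() - base < started) sched_yield();
        }
    }
    pthread_attr_destroy(&attr);
    return started;
}
```

Calling pthread_detach() on a thread after creating it, or calling pthread_join() on every thread, are the other standard ways to make sure a finished thread's resources are released.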