How to create processes and use them with fork() and pipe()

I am trying to figure out how to use fork() to create a process and then pipe() to communicate between those processes. I have watched a few videos and read a few articles and I think I get the basic idea of what a fork does, but not how I could use it for what I need it for. Same goes for pipes.

What I am needing to do is take a Dataflow graph like the one below:
input_var a,b,c,d;
internal_var p0,p1,p2,p3;
a -> p0;
- b -> p0;
c -> p1;
+ d -> p1;
a -> p2;
- d -> p2;
p0 -> p3;
* p1 -> p3;
+ p2 -> p3;
write(a,b,c,d,p0,p1,p2,p3)

and do the calculations using fork() and pipe()
The way I understand the above is that all the input_var's and internal_var's are processes and the arrows represent the data in the left process going to the right process through a pipe. I am given integer values for the input_var's (2, 3, 1, 8 for the above). I am also supposed to do the calculations above and store them in the indicated process (ex: p0 runs a-b, so that write statement at the bottom should return the result of a-b for p0)

I just don't understand how to use fork() for this. From what I understand it splits one process into two and it is required to create a process, but the above looks more like I'm turning two processes into one (a->p0 and -b->p0).
Can anyone please explain? I'm lost
You are correct that fork() makes a new copy of the current process. fork() returns 0 in the child process and the child's process ID in the parent process. It returns -1 if it fails, in which case no child is created.

The child inherits copies of the parent's open file descriptors, so both processes end up with descriptors for the same pipe. This is what lets pipe() work.

The key here is that you have to call pipe() before you call fork().

After the fork, one process writes to the pipe and the other process reads from it. Typically each process closes the end that it isn't using (i.e., if you're reading the pipe, you close the writing end).
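To make the pipe-then-fork order concrete, here is a minimal sketch (not taken from the assignment). The function name send_through_pipe and its out-parameter are made up for illustration; pipe(), fork(), read(), write() and waitpid() are the standard POSIX calls.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: a forked child sends `value` to the parent over a pipe.
// On success, stores what the parent read in `received` and returns true.
bool send_through_pipe(int value, int& received)
{
    int fds[2];                           // fds[0] = read end, fds[1] = write end
    if (pipe(fds) == -1) return false;    // create the pipe BEFORE forking

    pid_t pid = fork();
    if (pid == -1) { close(fds[0]); close(fds[1]); return false; }

    if (pid == 0) {                       // child: the writer
        close(fds[0]);                    // not reading, so close the read end
        ssize_t n = write(fds[1], &value, sizeof value);
        close(fds[1]);
        _exit(n == static_cast<ssize_t>(sizeof value) ? 0 : 1);
    }

    close(fds[1]);                        // parent: the reader, so close the write end
    ssize_t n = read(fds[0], &received, sizeof received);  // blocks until data arrives
    close(fds[0]);
    waitpid(pid, nullptr, 0);             // reap the child
    return n == static_cast<ssize_t>(sizeof received);
}
```

Calling send_through_pipe(42, r) from main() should leave 42 in r. The child uses _exit() so it terminates right there instead of continuing with the parent's code.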

Hope this helps!
Thank you for the reply.
That's pretty much what I've been finding online. I found this one program here: https://www.geeksforgeeks.org/c-program-demonstrate-fork-and-pipe/
And I kind of see what they're doing there, but I'm just not sure how to apply that to my program.
I'm thinking maybe a->p0 could be a fork a(parent) to p0(child), but then I get the next line that says -b->p0 where I would have to create a new process for b, but I already have the p0 one, so I don't need to split b into two processes.
I know from this link: https://www.geeksforgeeks.org/multiple-calculations-4-processes-using-fork/
That you can have 3 child processes (although it confuses me because they call fork twice, so isn't that two parents and two children?) So I'm thinking maybe it would make more sense that p0 is the parent and a and b are the children. And however I end up doing it I have to somehow make it where my program can handle different numbers of variables. I can have up to 10 input_var's and 10 internal_var's. I'm beginning to see how fork works, but I've yet to see an example on how to put it in a function or whatever so that it can be used like I need to use it.
Without knowing more about the problem you're trying to solve, it's hard to say how fork() can help you. Can you post a version of the program that runs as a single process?
I haven't really got that part of the program done at all yet. But I can try to explain what I'm needing to do.
I am given the dataflow graph below:
input_var a,b,c,d;
internal_var p0,p1,p2,p3;
   a -> p0;
 - b -> p0;
   c -> p1;
+ d -> p1;
   a -> p2;
 - d -> p2;
   p0 -> p3;
* p1 -> p3;
+ p2 -> p3;
write(a,b,c,d,p0,p1,p2,p3)


This graph gives the names of the input_var's, but I will be given values to go with those names (In the above they are a=2, b=3, c=1, and d=8). I am then supposed to process the part of the graph with the arrows. Every variable (input or internal) is supposed to be a process and each arrow is supposed to represent a pipe. I am then supposed to perform each operation in the indicated process.
For example:
a -> p0
-b -> p0
For this I would perform a-b in p0 (2-3)
Then after I am done with the arrow section I think I am supposed to write the values held in each process to an output file.
So the output for the above example should be:
a = 2
b = 3
c = 1
d = 8
p0 = -1
p1 = 9
p2 = -6
p3 = -15


I know I have to do this using fork() and pipe() and that all the variables (input and internal) have to be processes. I also know that there can be up to 10 of each type of variable, but I don't know the exact number there will be.

I am just sort of lost on how to do this with fork()'s. I know how to parse the graph, I'm just having trouble with the creating and using processes part.
I found this example online: http://www.sfu.ca/~reilande/
He has a program that has 1 parent and 6 child processes where the parent is just the control process. This gave me the idea that maybe each input_var could be a child process and the internal_var's go in the parent process and then each child passes its value to the parent where it performs the specified operation (ex: a-b for p0) then I guess it would store the result in an array.

What I've done so far for coding is basically just put the graph into a vector where each element is a line. It reads the list of input_var's and puts those names into a vector, then it reads the file that holds the values for the input_var's and inserts each value into another vector. This way I will have two parallel vectors: IptVarNames and IptVarVals.
I do the same thing with the names of the internal_var names and I'm thinking I could just create another vector to hold the internal_var values that will be calculated later.

The part that I am stumped on is the one that involves fork() and pipe().

Sorry for the really long post, but I wanted to try to explain thoroughly.
This seems odd to me. Can you post the actual text of the assignment? Can you post the exact format of the input you'll receive, preferably with an example if one is available.

From what you've described, I'd have one controlling parent process. That process reads the input, forks off processes p0-p3. Each child keeps track of its variable. The parent then sends commands to each of those processes via the pipe. The set of commands it sends are:
= num // set value to num
+ num // add num to value
- num // etc.
* num
STOP // stop and return value to parent process.

How will the children return their value to the parent? The parent could use the exit status of the process, but only the low 8 bits of it reach the parent, so that's limited to 0-255. I don't think you should use the pipe that flows from parent to child, so maybe a second pipe that passes the answer back to the parent?
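Here is a small sketch of why the exit status is a poor channel for results: on POSIX systems the parent only sees the low 8 bits of whatever the child passes to exit(). The helper name observed_exit_status is invented for this illustration.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child that exits with `code` and report what
// the parent actually sees. Returns -1 if fork() or waitpid() fails or the
// child did not exit normally.
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) _exit(code);            // child: terminate immediately with `code`

    int status = 0;
    if (waitpid(pid, &status, 0) == -1) return -1;
    if (!WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);           // only the low 8 bits of `code` survive
}
```

With this, observed_exit_status(7) gives 7, but observed_exit_status(300) gives 44 (300 modulo 256), and a negative result such as p0 = -1 would come back as 255. That is why a pipe back to the parent is the safer route.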
The problem with that is that the input variables (a,b,c,d) need to be processes too.
Here is the link: https://bit.ly/2pxtpVk

I was thinking the single controlling parent process too. I guess a,b,c,d would get their values from my vector that got them from the input file, then they would pass their values back to parent where they could be sent to each p# process as they were needed?
I was thinking the parent process could have an array that stores the values of each variable, but I don't see how that could work because as far as I understand it, either parent runs then children or children run then parent. So if parent runs first, then it COULD pass the vectors holding the results, but they wouldn't be populated yet because the children haven't run yet!
Could you post a link that isn’t just a redirect to your professor’s home page?
Oh sorry. I copied and pasted the url that was in the address bar and put it in bit.ly. For some reason I got the homepage url instead...
Here's the link to the pdf (I tested it this time): https://bit.ly/2zu0APu
Your assignment is not well-worded, and includes some weird non-standard argument handling, and a nice trip-up dealing with multiple delimiters between inputs, LOL. He is not specific on output. (Your prof is a jerk.)

I am going to play with it and I’ll report back with what bare minimum you need to complete the assignment. (I have a pretty good idea now, but I don’t want to steer you wrong.)

[edit]

LOL, have you studied dependency trees or anything like that in this course (or any previous)? Ever done a topological sort?

[edit 2]

After looking this over, I think this HW requires quite a bit more than simply spawning a few processes. There is no IPC synchronization outside of a simple spawn-wait in this assignment; it should be fairly deterministic to decide which process runs and when. If this isn't a 350 to 400 level course at minimum you're pretty screwed.

It might be worth scheduling a visit with your prof this weekend to get a better idea of what he expects you to do here.
Well if by delimiters you mean the commas/spaces between 2,3,1,8/2 3 1 8 he said that we could pick which one we prefer and just state it in our readme.txt. Same with where it says either an input file or stdin.

Dependency trees? Topological sort? no

This is a junior level class (it's 3000 level at my university)

I talked with him yesterday and I told him about an idea I had and he said it might work if they run the way I want them, and then he suggested another idea. I'll try to explain each below:

For both ideas I think the way I would handle pipe creation is basically count how many arrows there are and make an array of pipes that size.

My idea:
Have two parents. One parent is control process for input variables and the other is for internal variables.
Each input variable is a child and it gets the value (a=2) from my vector that holds the values it got from the file. Then each of these input_var children have their own pipe that they write their value to.
Each internal variable is a child that gets the operation it's supposed to perform from an array or something, then uses the pipes to read in the input_vars it needs and stores them in some variable inside the process (that part may be unnecessary) then once it has the input values it closes the read end of those pipes so that some other process can use them. Then it performs the operation and writes that result to its own pipe. That way when another internal_var process like p3 needs them it can read the result.

I explained this to him and he seemed unsure, but that it may work. He also told me that if I forced them to run in a certain order (a,b,c,d,p0,p1,p2,p3), then it would defeat the purpose of the assignment so...

His idea (as best I can explain it):
I have one main parent and I make all the internal_var's children (p0,p1,p2,p3). When each of these runs it checks what it is supposed to do (p0 does a-b), it then forks off for each variable it needs, making a and b child processes in p0's case, then those child processes do basically what they did in my idea, they just have a pipe that passes their value to whoever needs it. Then p0 does its operation and I guess writes it to a pipe like they do in my idea.

The problem I had with this idea is that what happens when a process needs variables that have already been turned into processes? like p2 for example, it wants a and d which would have already been created. He said that for this I should use execv() or something like that? I've looked execv() up, but I'm still pretty lost on how to use it.

All he taught us in class was the basic idea on what pipe and fork does and what they're for, but not really any code examples or in depth explanations on how you would implement them. He mentioned execv() and dup(), but not exactly what they were or how to use them. He also never mentioned "IPC" although he did mention process synchronization vaguely.
There are three types of processes here. The main process, the input var processes and the internal var processes.

Input Process
Each one has a "value" variable and a collection of output file descriptors that it writes its value to. The FDs are pipes, but the process doesn't care about that.

Internal Var Process
Each one has a "value" variable that's initialized to zero. It also has a vector of file descriptors that it will read and a list of operations to perform on the values. You might pass the operations on the command line. Finally, it has a collection of output file descriptors. Once it has computed its value (i.e., after reading all the input FDs and performing the operations), it writes its output value to each of the output FDs and exits.

Main process:
This one is more complex:
- read the first two lines of input to determine which processes must be spawned.
- while reading the remaining input, construct the appropriate pipes and add their file descriptors to the appropriate input/output collections for the processes.
- Each output process gets a pipe back to the main process.
- fork each process.
- wait for the procs. As the output processes complete, read the pipe from the output proc to the main proc to get the output process's value.
- when all processes have completed, print the results.

You might store the file descriptors and values in a small class. When you fork, you just have to tell the child which class instance it should use.
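One possible shape for that small class, as a rough sketch only. The names Node and connect_nodes are invented here, and storing raw int descriptors in vectors is just one way to do the bookkeeping.

```cpp
#include <string>
#include <vector>
#include <unistd.h>

// Hypothetical bookkeeping for one variable/process in the graph.
struct Node {
    std::string name;            // e.g. "a" or "p0"
    int value = 0;               // input value, or the computed result
    std::vector<int> in_fds;     // read ends this process reads from
    std::vector<char> ops;       // ops[i] is applied to the value read from in_fds[i]
    std::vector<int> out_fds;    // write ends this process writes its value to
};

// Create one pipe for the arrow "op src -> dst" and record each end.
// Meant to be called in the main process before any fork().
// Returns false if pipe() fails.
bool connect_nodes(Node& src, Node& dst, char op)
{
    int fds[2];
    if (pipe(fds) == -1) return false;
    src.out_fds.push_back(fds[1]);   // src writes into the pipe
    dst.in_fds.push_back(fds[0]);    // dst reads from it
    dst.ops.push_back(op);
    return true;
}
```

After the main process has called connect_nodes() for every arrow, each forked child only needs to know which Node it is: it reads from its in_fds, applies its ops, and writes to its out_fds.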
Sounds good I think.
Sorry if I am misunderstanding you, but I'm really tired atm (d*** hw). Are you saying do something more like his idea or mine? I like mine because I still have no idea what I would do with his when an internal_var process like p2 wants to use process "a" which has already been created by p0. At least I'm not sure how to handle this if I'm not hardcoding this graph.

The idea I'm currently working on is in main before I create pipes or processes, I read the entire graph to figure out what each process is doing and store it in an array, so that I'll have an array something like: ["a-b", "c+d", "a-d", "p0*p1+p2"] and then if I have the fork processes in a loop I can just have each process do array[i] to check what operation it's supposed to complete. Then I got to get it to read the operation to find which input_vars it needs and read the values from those processes.

I'm thinking this would be easier if I create all processes ahead of time rather than go with his idea of forking as I go, but I'm not even sure if my idea will work.

EDIT: Ok something I just realized about my idea is that I still have no control over what process runs when. I just ran this code to test my idea of using an array that holds the operations I need by seeing if the child process can view the value of i and it can but.....

test code:
for(int i=0;i<5;i++) // loop will run n times (n=5) 
    { 
        if(fork() == 0) 
        { 
            printf("[son] pid %d from [parent] pid %d\n",getpid(),getppid()); 
            cout << "This is i's current value: " << i << endl;
            exit(0); 
        } 
    } 
    for(int i=0;i<5;i++) // loop will run n times (n=5) 
    wait(NULL);


the somewhat discouraging result:
[son] pid 13726 from [parent] pid 13718
This is i's current value: 3
[son] pid 13727 from [parent] pid 13718
This is i's current value: 4
[son] pid 13725 from [parent] pid 13718
[son] pid 13724 from [parent] pid 13718
This is i's current value: 1
[son] pid 13723 from [parent] pid 13718
This is i's current value: 0
This is i's current value: 2 


It's good because it shows me that I can create many children from the same parent, but it's bad because what happens if process 4 is p3 and it needs the result that process 2 is going to calculate?

Unless pipes would fix this. I feel like I heard a process will wait until it reads something if it has a read statement. Would that fix it or make it even more of a rat's nest?
You can spawn the processes in any order you wish. Part of the point of this (horribly designed) homework is to control process order with the pipes. Processes wait on input until it can be read, and then write to the output, which enables the next process to read, etc.
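A sketch of that idea, under my own assumptions about names and layout (put_int, get_int and subtract_via_processes are not from the assignment). The consumer is deliberately forked before its two producers, and it still gets the right answer because read() blocks until something has been written.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Small helpers (hypothetical names) that move one int through a descriptor.
static bool put_int(int fd, int v)
{
    return write(fd, &v, sizeof v) == static_cast<ssize_t>(sizeof v);
}
static bool get_int(int fd, int& v)
{
    // Blocks until data arrives, or returns short on end-of-file.
    return read(fd, &v, sizeof v) == static_cast<ssize_t>(sizeof v);
}

// "p0" is forked BEFORE "a" and "b", yet still computes a - b.
bool subtract_via_processes(int a, int b, int& result)
{
    int a_to_p[2], b_to_p[2], p_to_main[2];
    // (A real program would also close any pipes already opened on failure.)
    if (pipe(a_to_p) == -1 || pipe(b_to_p) == -1 || pipe(p_to_main) == -1)
        return false;

    pid_t pids[3];

    pids[0] = fork();                        // p0 is spawned first
    if (pids[0] == 0) {
        close(a_to_p[1]);                    // close write ends p0 does not use, so a
        close(b_to_p[1]);                    // missing producer shows up as EOF, not a hang
        int x = 0, y = 0;
        bool ok = get_int(a_to_p[0], x) && get_int(b_to_p[0], y)
                  && put_int(p_to_main[1], x - y);
        _exit(ok ? 0 : 1);
    }
    pids[1] = fork();                        // b
    if (pids[1] == 0) _exit(put_int(b_to_p[1], b) ? 0 : 1);
    pids[2] = fork();                        // a
    if (pids[2] == 0) _exit(put_int(a_to_p[1], a) ? 0 : 1);

    // Parent: keep only the one read end it needs.
    close(a_to_p[0]); close(a_to_p[1]);
    close(b_to_p[0]); close(b_to_p[1]);
    close(p_to_main[1]);

    bool ok = get_int(p_to_main[0], result); // blocks until p0 has written its answer
    close(p_to_main[0]);
    for (pid_t pid : pids)
        if (pid > 0) waitpid(pid, nullptr, 0);
    return ok;
}
```

With a = 2 and b = 3 this should produce -1 no matter which child the scheduler happens to run first. Note that each value is written into its pipe exactly once and read exactly once; a value that has been read is gone from the pipe.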

[edit]

Oh, almost forgot: When you presented your idea, your professor essentially said, yeah, that might work, but...

Translated plainly: don’t do it that way; do it this way instead.

Do it the way your professor suggested.
Oh ok, well I can try to do it his way, but I'm still unsure of what to do when a process like p2 wants to use processes that p0 and p1 have already created (a and d). He said I could basically just use execv() to get it to run the code in the existing processes, but I don't know how to use that, and the info I found online hasn't really helped much. He might rather me do it his way, but he's not going to fail it just because I didn't do it that way (his TA's are the ones that grade it anyway), so I just want to do it whichever way is simpler and easier.

EDIT: I played around a little more in my test program. Here's the code I was messing with:
int pipes[5][2];

for(int i = 0; i < 5; i++)
	{
		pipe(pipes[i]);
	}

int arr[] = {2,3,1,8};

	for(int i=0; i < 4; i++)
	{
		int pid = fork();
		if(pid == 0)
		{
			printf("[son] pid %d from [parent] pid %d\n", getpid(), getppid());
			cout << "This is i's current value: " << i << endl;
			cout << "ARR: " << arr[i] << endl;
			write(pipes[i][1], &arr[i], SIZEINT);
			exit(0);
		}
	}
	for(int i=0; i<4; i++)
		wait(NULL);

	for(int i=0; i < 4; i++)
		{
			int pid = fork();
			if(pid == 0)
			{
				printf("[son] pid %d from [parent] pid %d\n", getpid(), getppid());

				int val = 0;
				read(pipes[i][0], &val, SIZEINT);
				cout << "this is val from set 2: " << val << endl;
				exit(0);
			}
		}
		for(int i=0; i<4; i++)
			wait(NULL);


And here's the output:
[son] pid 15115 from [parent] pid 15107
This is i's current value: 3
ARR: 8
[son] pid 15113 from [parent] pid 15107
[son] pid 15112 from [parent] pid 15107
This is i's current value: 1
ARR: 3
This is i's current value: 0
ARR: 2
[son] pid 15114 from [parent] pid 15107
This is i's current value: 2
ARR: 1
[son] pid 15116 from [parent] pid 15107
SET2: I's current value: 0
this is val from set 2: 2
[son] pid 15117 from [parent] pid 15107
[son] pid 15118 from [parent] pid 15107
SET2: I's current value: 2
this is val from set 2: 1
SET2: I's current value: 1
this is val from set 2: 3
[son] pid 15119 from [parent] pid 15107
SET2: I's current value: 3
this is val from set 2: 8


So everything is being sent, but it's all in a fairly random order, but if I am getting the first set to send its value through a pipe from an array of pipes, then does the order matter? Say the first set runs with the i's in the order 0,3,2,1. If in the second set I have it read from pipe[2] when I need it when the second set has order 1,0,2,3 does it matter which one I'm in when I read it? Hopefully I asked that right. Basically, should I be concerned by the results I got above?
I’ll look at it later.

The exec*() functions replace the current process image with a different program. I have no idea why your professor would recommend that except if he expects you to create a new argv[1] for the exec*ed process. Also, it is not a friendly way to share pipe file descriptors, since you would have to have some way of passing that information to the new program (or have it assume it); a complete headache obviated by just using fork().
oh ok.
Well I played around a little more...
Here's the code:
int pipes[5][2];

for(int i = 0; i < 5; i++)
{
	pipe(pipes[i]);
}

int arr[] = {2,3,1,8};

	for(int i=0; i < 4; i++)
	{
		int pid = fork();
		if(pid == 0)
		{
			close(pipes[i][0]);
			printf("[son] pid %d from [parent] pid %d\n", getpid(), getppid());
			cout << "This is i's current value: " << i << endl;
			cout << "ARR: " << arr[i] << endl;
			write(pipes[i][1], &arr[i], SIZEINT);
			//close(pipes[i][1]);
			exit(0);
		}
	}
	for(int i=0; i<4; i++)
		wait(NULL);

	for(int i=0; i < 4; i++)
		{
			int pid = fork();
			if(pid == 0)
			{
				close(pipes[i][1]);
				close(pipes[i+1][1]);
				printf("============[son] pid %d from [parent] pid %d\n", getpid(), getppid());
				cout << "===========SET2: I's current value: " << i << endl;
				int val = 0;
				int val2 = 0;
				read(pipes[i][0], &val, SIZEINT);
				read(pipes[i+1][0], &val2, SIZEINT);
				close(pipes[i][0]);
				close(pipes[i+1][0]);
				cout << "========this is val from set 2: " << val << endl;
				cout << "========this is val2 from set 2: " << val2 << endl;
				cout << "========val1 + val2 = " << val+val2 << endl;
				exit(0);
			}
		}
		for(int i=0; i<4; i++)
			wait(NULL);


and this was the result from the last time I ran it:
[son] pid 15983 from [parent] pid 15977
[son] pid 15984 from [parent] pid 15977
[son] pid 15985 from [parent] pid 15977
This is i's current value: 3
ARR: 8
[son] pid 15982 from [parent] pid 15977
This is i's current value: 1
ARR: 3
This is i's current value: 2
ARR: 1
This is i's current value: 0
ARR: 2
============[son] pid 15989 from [parent] pid 15977
===========SET2: I's current value: 3
============[son] pid 15988 from [parent] pid 15977
===========SET2: I's current value: 2
============[son] pid 15986 from [parent] pid 15977
===========SET2: I's current value: 0
========this is val from set 2: 2
========this is val2 from set 2: 3
========val1 + val2 = 5
============[son] pid 15987 from [parent] pid 15977
===========SET2: I's current value: 1


This time I feel like I have an idea of what's going on, just not how to deal with it.
This is what I think is going on: It runs the part that checks my array[0] and array[1] and does the operation which in this case is a+b (2+3). But then those processes basically hog a and b and the next process it tries to run gets hung up waiting for its turn using a or b.
The problem I have with this conclusion is that the output above says it tried index = 3 and 2 first, so why didn't they work? I mean 3 probably didn't work because then i+1 would be 4 and there is no array[4], but why didn't 2 work?

EDIT: Sorry for the weird formatting of the code above. That's just how it keeps ending up after I copy and paste it from Eclipse.
"Are you saying do something more like his idea or mine?"
Neither. I'm saying do something more like my idea.
Ok so I am trying to understand your idea. Is the main process a parent and all the other processes (Input and Internal) children?

And you say in the main process to read the first two lines of input to determine which processes to create. What happens when I run into something like p3 where I need the 3 lines? Would I just have the program check the next line to see if it has the same internal var process as the previous lines?

I think what I was currently trying is similar to what you are describing, but it's been giving me issues.
I just made it in its own test program where it only has to deal with the lines "a -> p0" and "-b -> p0".

string inN[] = {"a", "b"};
	int inV[] = {2,3};
	string intN[] = {"p0", "p1", "p2"};
	string lines[] = {"a -> p0",
			         "-b -> p0"};
	int pid;
	int pipes[2][2];

	pipe(pipes[0]);
	pipe(pipes[1]);

	int ind = lines[0].find(">");

	cout << lines[0].substr(ind+2) << endl;

	for (int i = 0; i < 2; i++)
	{
		string currLine = lines[i];
		string s;

		if(!isOper(currLine[0]))
		{
			pid = fork();

			if(pid == 0)
			{
				cout << "This is child from no-op" << endl;
				printf("[son] pid %d from [parent] pid %d\n", getpid(), getppid());
				int temp = 0;
				close(pipes[i][1]);
				read(pipes[i][0], &temp, SIZEINT);
				close(pipes[i][0]);
				write(pipes[i][1], &temp, SIZEINT);
				cout << "This is temp from no-op: " << temp << endl;
				exit(0);
			}
			else if(pid > 0)
			{
				cout << "This is parent from no-op" << endl;
				int val = inV[i];
				close(pipes[i][0]);
				write(pipes[i][1], &val, SIZEINT);
				close(pipes[i][1]);
			}
		}
		else if(isOper(currLine[0]))
		{
			pid = fork();

			if(pid == 0)
			{
				printf("[son] pid %d from [parent] pid %d\n", getpid(), getppid());
				cout << "This is child from op" << endl;
				exit(0);
			}
			else if(pid > 0)
			{
				cout << "This is parent from op" << endl;
				int temp = 0;
				close(pipes[i][0]);
				read(pipes[i][1], &temp, SIZEINT);
				close(pipes[i][0]);
				cout << "This is temp from op in parent: " << temp << endl;
			}
		}
	}


This is its output:
p0
This is parent from no-op
This is child from no-op
[son] pid 19135 from [parent] pid 19129
This is parent from op
This is temp from op in parent: 0
This is temp from no-op: 2
[son] pid 19136 from [parent] pid 1269
This is child from op


Doesn't attempt an operation yet. I'm trying to get it to pass values correctly first.
Alright, I know that the submission deadline has passed, but I just spent the past couple of nights playing with this assignment —as I understand it— and your professor is an a**.

I spent more time screwing around parsing that dependency graph than anything else.


The point of this homework is to initialize and instantiate a static dependency tree, where nodes are processes and edges are pipes. The flow of data through the pipes is automatically managed by the dependencies themselves. The hard part is connecting the pipes before creating subprocesses.

A very simple example:
input_var a,b;
internal_var p0;
  a -> p0;
+ b -> p0;
write(a,b,p0).

That's three nodes: a, b, and p0. (Four nodes if you count the parent process.)
But that's also five pipes:
+---+                   +---+
| a |                   | b |
+---+                   +---+
 | |        +----+       | |
 | +------->| p0 |<------+ |
 |          +----+         |
 |            |            |
 |            v            |
 |         +------+        |
 +-------->|parent|<-------+
           +------+

Remember, your parent process also gets output from each node. Notice how 'b' has an output pipe to both 'p0' and to the parent process?

Had the 'write' statement read write(a,p0), you would not have that second pipe leading back to the parent process:
+---+                   +---+
| a |                   | b |
+---+                   +---+
 | |        +----+       |
 | +------->| p0 |<------+
 |          +----+
 |            |
 |            v
 |         +------+
 +-------->|parent|
           +------+

There are a few caveats, too. You were not pointed to the fdopen() function, but you probably want that. Likewise, you must make sure to flush your output every time you write to a pipe through a FILE*; stdio buffers it, so until you flush (or the stream is closed) it will not be readable on the other end. Finally, and this is important:

    all pipes must be created before any subprocess is forked.

That is, you should have the entire tree figured out and ready before you begin forking.
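To illustrate the fdopen() and flushing caveats, here is a minimal sketch. The function send_as_text is a made-up name and this is not the code I actually used; it only shows the pattern of wrapping each pipe end in a FILE* and flushing after writing.

```cpp
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: a child prints `value` as text into a pipe through a
// FILE* stream; the parent reads it back with fscanf(). Returns false on failure.
bool send_as_text(double value, double& received)
{
    int fds[2];
    if (pipe(fds) == -1) return false;       // pipe first, fork second

    pid_t pid = fork();
    if (pid == -1) { close(fds[0]); close(fds[1]); return false; }

    if (pid == 0) {                          // child: the writer
        close(fds[0]);
        FILE* out = fdopen(fds[1], "w");     // wrap the write end in a stdio stream
        if (!out) _exit(1);
        fprintf(out, "%g\n", value);         // the newline marks the end of the value
        fflush(out);                         // push stdio's buffer into the pipe
        _exit(0);                            // _exit() skips stdio cleanup, so the fflush() matters
    }

    close(fds[1]);                           // parent: the reader
    FILE* in = fdopen(fds[0], "r");
    if (!in) { close(fds[0]); waitpid(pid, nullptr, 0); return false; }
    bool ok = fscanf(in, "%lf", &received) == 1;
    fclose(in);                              // also closes fds[0]
    waitpid(pid, nullptr, 0);
    return ok;
}
```

Sending text rather than raw bytes keeps the pipes easy to debug, at the cost of "%g" rounding to six significant digits, which is fine for small integer inputs like the ones in this assignment.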

I designed my tree with a simple struct, like this:

typedef struct var
{
  char    name[ 10 ];     // The name, like 'a' and 'p0'.
  vartype type;
  
  FILE*   inputs [ 10 ];  // read end of a pipe
  int     ninputs;

  FILE*   outputs[ 10 ];  // write end of a pipe
  int     noutputs;

  // input_var
  double  value;
  
  // internal_var
  char    mathops[ 10 ];
}
var;

In main(), I keep a static array of var:
  var vars[ 30 ];
  int nvars = 0;

Finish that with a couple of functions to create and find a var:
  var create_var( const char* name, vartype type );
  var* find_var( const char* name, var* vars, int nvars );

Now I have a variable lookup table, and I can identify what kind of statement I got the var from: 'input_var', 'internal_var', or 'write'.

I have a function that takes a line of text and strtok()s it into a list of names and creates a var for each name with the appropriate type. I call it once each for the first line (input_var), the second line (internal_var), and the last line (write).

Next is to process the dependencies. I have a (not-so-little, about 40 lines long) function that takes a line of input and decodes the stupid dependency syntax into three things:
  char  mathop
  char* varname0
  char* varname1
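My real function is C and uses strtok(), but the decoding step can be sketched roughly like this in C++. Everything here (Dependency, trim, parse_arrow_line, and using '=' to mean "no operator on this line") is a hypothetical stand-in, not the code I actually wrote.

```cpp
#include <cctype>
#include <string>

// Hypothetical result type for one dependency line such as " - b -> p0;".
struct Dependency {
    char op;            // '+', '-', '*', '/' or '=' when the line has no operator
    std::string src;    // name on the left of "->"
    std::string dst;    // name on the right of "->"
};

// Remove leading and trailing whitespace.
static std::string trim(const std::string& s)
{
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    return s.substr(b, e - b);
}

// Parse "[op] src -> dst[;]". Returns false if the line is not a dependency
// (for example the final write(...) statement).
bool parse_arrow_line(const std::string& line, Dependency& out)
{
    std::size_t arrow = line.find("->");
    if (arrow == std::string::npos) return false;

    std::string left = trim(line.substr(0, arrow));
    std::string right = trim(line.substr(arrow + 2));
    if (!right.empty() && right.back() == ';')       // drop the statement terminator
        right = trim(right.substr(0, right.size() - 1));

    out.op = '=';                                    // assumption: no operator means "assign"
    if (!left.empty() && std::string("+-*/").find(left[0]) != std::string::npos) {
        out.op = left[0];
        left = trim(left.substr(1));
    }
    if (left.empty() || right.empty()) return false;
    out.src = left;
    out.dst = right;
    return true;
}
```

For the line " - b -> p0;" this yields op '-', src "b", dst "p0". It does no validation of the names themselves; that is left to the lookup step.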

Once found, I look up the names to find the vars, complain if they don't exist or are not the correct type, and then create the pipe that connects the two vars:
  int fds[ 2 ];
  if (pipe( fds ) != 0) fooey();  // pipe() returns 0 on success, -1 on failure

  v0->outputs[ v0->noutputs ] = fdopen( fds[ 1 ], "w" );
  v0->noutputs += 1;

  v1->inputs [ v1->ninputs  ] = fdopen( fds[ 0 ], "r" );
  v1->mathops[ v1->ninputs  ] = mathop;
  v1->ninputs += 1;

This same hookup is done with the list of 'write' vars.

So far, my main() looks like this:
  char s[ 1000 ];
  var vars[ 30 ];
  int nvars = 0;

  // input_var
  if (!readline( f, s, sizeof s, ';' )) fooey();
  parse_varnames( s, vars, &nvars, input_var );
  
  // internal_var
  if (!readline( f, s, sizeof s, ';' )) fooey();
  parse_varnames( s, vars, &nvars, internal_var );

  // dependencies
  while (readline( f, s, sizeof s, ';' ))
    if (!parse_dependency( s, vars, nvars ))
      break;

  // write()
  int nwritevars = nvars;
  parse_varnames( s, vars, &nwritevars, write_var );
(The readline() function is a utility function I wrote to properly input a line ending on a given separator. You could just use fgets() directly in only a couple more lines of code each.)
(Also, I wanted to keep the write vars separate, hence the separate counter.)




At this point we are ready to create our subprocesses.

The forking is pretty simple. The new process does not use main(). It should instead use a simple function and then terminate. I wrote a little procedure to help me fork. Here's a version that is fairly easy to read:
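void fork_var( var* v )
{
  switch (fork())
  {
    case 0:   // child: run the matching procedure, then terminate
      if (v->type == input_var)    exit( input_var_proc( v ) );
      if (v->type == internal_var) exit( internal_var_proc( v ) );
      exit( 1 );  // any other type has no process to run

    case -1:  // fork failed
      fooey();
  }
  // parent: falls out of the switch and returns
}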

If the fork fails, we just crash hard (fooey()).
The child determines which function to call, calls it, and terminates.
The parent simply returns from the function and pretends not to care that it just spawned a child.

Those two little functions are the core of your subprocesses. Here's my input var proc in all its glory:
int input_var_proc( var* v )
// An 'input_var' process simply prints its stored value to all of its output pipes and terminates
{
  while (v->noutputs--)
    fprintf( v->outputs[ v->noutputs ], "%g\n", v->value );  // '\n' is important!
  return 0;
}

That is, an 'input_var' process does nothing but print its stored value to all its output pipes and then terminates.

The 'internal_var' process is only slightly more complicated: it only needs to read from all its input pipes, in order, and apply the mathematical operation to a local variable. When done, it needs to print the resulting value to all its output pipes and terminate.

tl;dr: This is basically a loop with a switch statement for the read + mathop, followed by another loop just like in the input_var_proc.
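Roughly, and only as a sketch: the struct below is a trimmed-down, hypothetical stand-in for my var struct, the function name internal_var_sketch is made up, and treating a missing operator as "start from this value" is an assumption about how the first dependency line gets encoded.

```cpp
#include <stdio.h>
#include <unistd.h>

// Hypothetical, trimmed-down stand-in for the var struct.
struct mini_var {
    FILE* inputs[10];   int ninputs;    // read ends, in dependency order
    char  mathops[10];                  // mathops[i] goes with inputs[i]
    FILE* outputs[10];  int noutputs;   // write ends
};

// Read one value per input pipe, fold it into `value` with the matching
// operator, then print the result to every output pipe. Returns 0 on success.
int internal_var_sketch(mini_var* v)
{
    double value = 0.0;
    for (int n = 0; n < v->ninputs; n++) {
        double x;
        if (fscanf(v->inputs[n], "%lf", &x) != 1) return 1;  // blocks until the producer writes
        switch (v->mathops[n]) {
            case '+': value += x; break;
            case '-': value -= x; break;
            case '*': value *= x; break;
            case '/': value /= x; break;
            default:  value  = x; break;   // assumption: no operator means "start from x"
        }
    }
    for (int n = 0; n < v->noutputs; n++) {
        fprintf(v->outputs[n], "%g\n", value);
        fflush(v->outputs[n]);             // make sure the reader actually sees it
    }
    return 0;
}
```

The operators are applied strictly left to right in the order the dependency lines appeared, so for p3 in the original graph that is ((p0 * p1) + p2) = ((-1 * 9) + -6) = -15, which matches the expected output.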


Back to main(), we can effect all this forking and pipe I/O with a simple loop:
  for (int n = 0; n < nvars; n++)
    fork_var( vars + n );


The only thing remaining to do is print all the variables asked for in the 'write' statement:
  for (int n = nvars; n < nwritevars; n++)
    printf( "%g\n", read_double( vars[ n ].inputs[ 0 ] ) );

A good programmer would also be encouraged to go back and close all those pipe ends, but terminating the process does that automatically, so I didn't bother for either the parent or any subprocess. ;->


There is one other thing I did to make my life a lot simpler: utility functions. Like the 'read_double()' function you see me using above. And the 'readline()' function to get a line of text from the dataflow graph. It makes it so much easier to program.

Really, the hardest part was parsing the dataflow graph. Once the whole thing is parsed, forking the child processes and letting them sort themselves with pipe I/O is easy.

I hope this is helpful.