convert text file to graph using c++

Forum

Forum
Beginners
convert text file to graph using c++

convert text file to graph using c++

i have to read data from file and translate it to graph in c++

the textfile looks like this

INPUT(G1)INPUT(G2)INPUT(G3)

INPUT(G6)INPUT(G7)OUTPUT(G22)

OUTPUT(G23)G10 = nand(G1, G3)G11 = nor(G3, G6)
G16 = and(G10 ,G2, G11)G19 = or(G11, G7)G22 = xor(G10, G16)
G23 = xnor(G16, G19)
which INPUTS and OUTPUTS are nodes...for instance INPUT(G1) is vertex that named G1 or OUTPUT(G22) also is vertex with name G22, etc.

and the functions just produce edges or connect the input and output vertices... e.g. G16 = and(G10 ,G2, G11) forms 3 edges which are G2 to G16, G11 to G16 and G10 to G16 or for G10 = nand(G1, G3) it connects G1 to G10 and G3 to G10, and so on.

i've written some code which i think it doesn't even close to what i want... i'm beginner programmer...so...would you please give me some hints? your help will greatly appreciated

#include <iostream>
#include <fstream>
//#include <sstream>
#include <string>
#include <list>


using namespace std;

class node 
{
private:
	string name;
	double weight;
public:
	node (string a , double d): name(a) , weight(d){}
	string vertex_name() const {return name;}
	double delay() const {return weight;}

};
class edge
{
public:
	static int NoV; //number of vertices
	list <node> *node1;   // pointers of nodes

	edge(){NoV = 0;}
	edge(int nov)   //constructor
	{
	this -> NoV = nov;
	node1 = new list<node> [nov];
	}
	void add_edge(string e, string b, double d);
	
};	
		
void edge::add_edge(string e, string b, double d)
{
	node n1(b, d);
	node1[e].push_back(n1); //add e to nodes list
					
}

int main ()
{
	
ifstream input;
    string line;
	string start_vertex;
	string end_vertex;
	input.open("circuit.txt");
	while(!input.eof())
	{
		input >> line;
		// just write the code below...i have no idea :)
		if (line == "INPUT" || line == "OUTPUT")
		{
			node G1(line,1.5);

		}
	 
		else if (line == "nand" || line == "and" || line == "xor" 
								|| line == "or" || line == "nor"
								|| line == "xnor")	
			edge arc(11);
		start_vertex = line;
		input >> line;
		end_vertex = line;
	}

	input.close();
	 

	cout << "circuit has (" << edge::NoV <<") nodes"  ;
	cout << " vertices are " << start_vertex << " " << end_vertex ;

return 0;
}

Duthomhas (13119)

There are several issues here.

The first is that a graph can be represented in all kinds of ways. A graph is an abstract concept. The trick, then, is to represent it using a structure that makes your manipulations of it convenient.

In the description you have provided, you list three characteristics of nodes:

• name (like “G23”)
• zero or two linked nodes
• type (“INPUT”, “OUTPUT”, “NAND”, etc)

Your structure should also have these things. Naïvely, you might write:

struct node
{
  std::string name;
  ??? linked[2];
  ??? type;
};

There is a difference between these things, though. Notice that a node is actually referenced by its name. For example, you NAND two INPUT nodes by referencing them by name:

nand(G1,G2)

It might help to do the same. (And, as it turns out, this is actually a very good way to do it.)

Your graph might then want to store nodes in a std::map:

typedef std::string NodeName;
struct Node
{
  ??? links[2];
  ??? type;
};
typedef std::map <NodeName, Node> Graph;

This makes our links very easy references as well — we just need the name of the node being linked:

typedef std::string NodeName;
struct Node
{
  NodeName links[2];
  ??? type;
};
typedef std::map <NodeName, Node> Graph;

What this also does is make it very easy to process the graph:
    for each OUTPUT node in the graph:       cout << evaluate( the node ) << "\n";
And
    func evaluate( node ):       switch (graph[node].type)         case nand: return ! (evaluate(graph[node].links[0]) && evaluate(graph[node].links[1]));         case ...
The type of a node might be any useful object, even a string. You could easily wow your prof by using an enum, which would work very well in that evaluate() switch.

The last trick is reading your input into the graph. (Which is what you are asking about...)
Remember, a graph is however you want to represent it.

Put another way, you are not translating the input into C++; rather you are translating it into however you represent it.

For a map of nodes, as I have suggested, this because really, really easy. Everything has a very specific format:

    [NAME '='] TYPE '(' LINK0 [',' LINK1] ')'

Neither C++ nor C make any functions that make parsing this kind of stuff particularly easy straight from file. You will probably want to write a few functions that ask for a specific thing. Like:

std::string get_name_or_type( std::istream& f )
// Skips whitespace, returns all alphanumeric characters until encountering one of:
// space, equal sign, comma, or parenthesis (left or right). (Or EOF.)
{
  int c;
  std::string s;
  f >> std::ws; // skip whitespace

  // Read a character. While character is alphanumeric, append to s.
  while (f.get( c ) && std::isalnum( c )) s += c;

  // The last character read was not alphanumeric. 
  // We'll want to read it again, so put it back in the stream.
  f.putback( c );

  return s;
}

bool is_read_equal( std::istream& f )
// If the next character in f is an equal sign, read it and return true.
// Return false otherwise.
{
  if (f.peek() != '=') return false;
  f.get();
  return true;
}

This kind of code is the basis of how a standard Recursive Descent parser is designed.
As a side note, several modern compiler writers have rediscovered that Recursive Descent is an exceedingly powerful parsing method. Microsoft, for example.

Your reading code now becomes exceedingly simple.
while (true): std::string name, type = get_name_or_type( myfile ); if (!myfile) break; // all done if (is_read_equals( myfile )): name = type; type = get_name_or_type( myfile ); if (!is_read_open_paren( myfile )) complain_and_quit(); ...
and so on.

The graph need not be represented this way. You can represent it as a list or vector of values. You can use the self-created node types. You can use an array. The associate array (or std::map) is just likely your easiest bet. If you must design something yourself, design something that approximates that.

struct Node
{
  string links[2];
  NodeType type;
};
struct NodeReference
{
  string name;
  Node* node;
};
NodeReference graph[ MAX_NODES ];
unsigned long num_nodes = 0;

I know this is a lot to take in. Take your time and things will become clear.
Hope this helps.

[edit]
Heh, I just wrote this at home...

My input file [output = and(input1,or(input2,input3))]:

output(g0)
g0 = and(g1,g4)
g4 = or(g2,g3)
input(g1)
input(g2)
input(g3)

And my run...

C:\m\Programming\cpp\random\a>a t1.txt
0 0 0
0

C:\m\Programming\cpp\random\a>a t1.txt
0 0 1
0

C:\m\Programming\cpp\random\a>a t1.txt
0 1 0
0

C:\m\Programming\cpp\random\a>a t1.txt
0 1 1
0

C:\m\Programming\cpp\random\a>a t1.txt
1 0 0
0

C:\m\Programming\cpp\random\a>a t1.txt
1 0 1
1

C:\m\Programming\cpp\random\a>a t1.txt
1 1 0
1

C:\m\Programming\cpp\random\a>a t1.txt
1 1 1
1

C:\m\Programming\cpp\random\a>type t1.txt

Heh heh heh...
Did it in about 150 lines, which are a few more than strictly necessary, but I like things clean... Additional output for errors:

output(g0)
input(g1,g2)
g7=and(g3)
↓
ERROR: node U1:INPUT expected close parenthesis

output(g0)
input(g1)
g7=and(g3)
↓
ERROR: node G7:AND missing second link

Here’s how to do enums:

#define NODE_TYPES(F) \
  F(INPUT, 1) \
  F(OUTPUT,1) \
  F(AND,   2) \
  F(NAND,  2) \
  F(OR,    2) \
  F(XOR,   2) \
  F(NOT,   1)
  
#define F(a,b) a ## _NODE,
typedef enum { NODE_TYPES(F) NumNodeTypes } NodeType;
#undef F

#define F(a,b) # a,
const char* NodeNames[] = { NODE_TYPES(F) };
#undef F

#define F(a,b) b,
int NumNodeLinks[] = { NODE_TYPES(F) 0 };
#undef F

LOL, It was fun!

Last edited on

ETH11674 (20)

thank you for your time...would you please elaborate more on parser part...its quite confusing for me...if it possible give me more detail please...thanks a million...

Duthomhas (13119)

A parser for a hard grammar expects certain input.

From what I can figure (and I could be wrong) your grammar looks like any number of node definitions, where a single node definition looks like:

    [name '='] type '(' name [',' name] ')'

(Brackets indicate a sequence is optional. Name and type are user-defined strings (like “G3” and “OUTPUT”). Quoted items are exact.

So, here are some of your examples:

    INPUT(G1) → no name, type=“INPUT”, link0=“G1”.

    G10 = nand(G1, G3) → name=“G10”, type=“nand”, link0=“G1”, link1=“G3”

To read these things you'll need a function that reads a string of alpha-numeric characters only. Use that to get the first part. If the next character after that string is an '=', then the string is a name and you'll have to read again to get the type. Otherwise you already have the type. Create a fake name (like “U1”, “U2”, etc.)

Likewise, you should next encounter a '(', followed by a name (your first link). If the next item is a comma ',', you should next have a second name (the second link). Otherwise it should all end with a close parenthesis ')'.

That's the end of a node, and you have all the pieces needed to add it to the graph.

Repeat until you encounter EOF (which will happen when you try to read the first alpha-numeric string for the next node).

My code looks something like this:

std::istream& read_name_or_type( std::istream& ins, std::string& s )
{
  // this is very much like getline(), except that it only reads alpha-numeric.
  char c;
  s.clear();
  ins >> std::ws;

  // get characters until EOF/failure or until we get a character we don't want
  while (ins.get( c ) and std::isalnum( c ))
    s += std::toupper( c );

  // put back the one we don't want and return the istream, just like getline()
  return ins.putback( c );
}

Graph read_graph( std::istream& f )
{
  Graph G;
  Node node;
  std::string name;
  std::string s;

  while (read_name_or_type( f, s ))
  {
    char c;
    f >> c;
    if (c == '=')
    {
      // found an '=', so name really is name. We need to read the type...
      name = s;
      if (!read_name_or_type( f, s )) return complain( "expected TYPE" );
      node.type = string_to_type( s );
      // ...and again get the next non-whitespace character
      f >> c;
    }
    else
    {
      name = generate_next_unique_name();
      node.type = string_to_type( s );
    }

    if (c != '(') return complain( "expected '('" );

    // remainder is exercise for you. What comes next?
    // What may come after that?
    // Etc.
    ...

    // Finally, if we get this far, we can add the node to the graph.
    // I used a std::map<Node> for the Graph, so I said:
    G[ name ] = node;
    // If you use some other construct, such as an 'add()' method:
    G.add( name, node );
    // or an add function:
    add( G, name, node );
    // whatever the case, you must operate the graph and nodes according to however you designed it
  }

  return G;
}

Once again, the design of your graph matters. I obviously like the std::map design here, but that is not the only way to do it. Heck, you could have a bunch of arrays.

std::string node_names[MAX_GRAPH_NODES];
node_type   node_types[MAX_GRAPH_NODES];
std::string node_link0s[MAX_GRAPH_NODES];
std::string node_link1s[MAX_GRAPH_NODES];
unsigned num_nodes = 0;

Then adding a node is just putting stuff in the right place as you go:

node_names[ num_nodes ] = s;
node_types[ num_nodes ] = string_to_type( s );
read_name_or_type( node_link0s[ num_nodes ] );
read_name_or_type( node_link1s[ num_nodes ] );

and finally updating it as included in the list:

num_nodes += 1;

However you do it, pick some way and stick to it.

Parsing the nodes is actually harder than processing them. Here's an actual piece of the code I wrote:

bool evaluate( Graph& G, const std::string& name )
{
  switch (G[ name ].type)
  {
    case INPUT_NODE:  return get_input();
    case OUTPUT_NODE: return   evaluate( G, G[ name ].links[0] );
    case NOT_NODE:    return  !evaluate( G, G[ name ].links[0] );
    ...

And, since it bears repeating: the trick to programming all this stuff is to decide how to store and manipulate your data.

I used an enum to handle node types, which let me use that switch statement. You could just as easily use a string for the node types and an if...else if... sequence.

I used a std::map so I could look up a node by name. You could build a tree using node*s. Use arrays. Use one array for all the nodes and use indexes into the array for links.

This is where programming is interesting. You design how it works.

(Most of the stuff I wrote is a little less pedantic than what you see here, but that is only because I have a greater facility with C++ than you do yet, so I'll stick to trying not to confuse you by skipping steps with awesome language constructs.)

Hope this helps.

Last edited on

ETH11674 (20)

thank you dear Duthomhas...that's a huge help
thank you very much

Duthomhas (13119)

(Responses to PM:)

It doesn't get any easier. (This is about as easy as it gets.)

NAND and OR are absolutely not the same!


NAND = !(evaluate(left) && evaluate(right));
OR   =   evaluate(left) || evaluate(right);

Since && and || are short-circuit operators in C++, I do recommend that you take care to evaluate both sides separately. You can do it right in the switch statement or you can create a little function to encapsulate it for you:

bool do_NAND(...)
{
  bool left_result  = evaluate( links[0] );
  bool right_result = evaluate( links[1] );
  return !(left_result && right_result);
}

  case NAND_NODE: return do_NAND( ... );

Edges are nothing more than a way to find another node. I'm not sure where you are hung up on this, and I am still not sure I completely understand your assignment, but as I can figure from what you have put here so far your graph works something like this:

How it is represented in the input text file

OUTPUT(G1)
INPUT(G2)
INPUT(G3)
INPUT(G4)
G5=OR(G3,G4)
G1=NAND(G2,G5)

How it is represented if you draw it


  ╔══╤══════╗    ╔══╤════╗           ╔══╤═════╗
  ║U1│OUTPUT╟────╢G1│NAND╟───────────╢G2│INPUT║  
  ╚══╧══════╝    ╚══╧══╤═╝           ╚══╧═════╝
                       │                       
                       │             ╔══╤═════╗
                       │       ┌─────╢G3│INPUT║
                       │  ╔══╤═╧╗    ╚══╧═════╝
                       └──╢G5│OR║
                          ╚══╧═╤╝    ╔══╤═════╗
                               └─────╢G4│INPUT║
                                     ╚══╧═════╝

How I have represented it using a std::map

{
  "U1": { type=OUTPUT, links={"G1"      } },  
  "G1": { type=NAND,   links={"G2", "G5"} },
  "G2": { type=INPUT,  links={          } },
  "G3": { type=INPUT,  links={          } },
  "G4": { type=INPUT,  links={          } },
  "G5": { type=OR,     links={"G3", "G4"} }
}

These are all the same graph. Remember, a graph is an abstract object. Each representation of the graph has its uses.

The file representation makes it easy to read and write with a computer program.
The drawn representation makes it easy for us humans to look at.
The map representation makes it easy for the C++ computer program to store and manipulate it.

In all cases, the name of a node is the same as an edge. So from node “G5” I can refer to node “G3” simply by knowing its name. Knowing a name is the same as an edge, because you can access a node with its name.

Traversal then becomes super mega easy. Start with an OUTPUT node and follow the names.

Your assignment has two major parts:
• read the text file into a representation of your graph in the computer (I used a map)
• traverse the graph (recursively) to compute node values

There is also the issue of how input is obtained. What you have posted has not been clear about that. I can only presume that inputs are in the same order as they are named and/or listed:

Given:

INPUT(G1)INPUT(G2)INPUT(G3)
INPUT(G6)INPUT(G7)

And an input of:

Then I presume that the input nodes have values:
G1: 1 G2: 0 G3: 0 (skipping 1 and 0) G6: 0 G7: 1
This is all purely guesswork on my part.

ETH11674 (20)

bool do_NAND(...)
{
  bool left_result  = evaluate( links[0] );
  bool right_result = evaluate( links[1] );
  return !(left_result && right_result);
}

i understand what you've done above...but the thing that i want is little different
by the Same i mean each gate act like black box just connect inputs node to the output node...in other words i want to the timing graph equivalent to the circuit

i.e. if we have a gate that has 3 inputs it just connect them to the output node...we don't give value to the inputs!!!

my final goal is to obtain a DAG graph from the text file and then assign each edge a delay(weight) then
find the shortest and longest path from primary inputs to primary output...this is why i said NAND and OR are the same because both of them has 2 inputs ,1 output...so they both generate 2 edges(same for MUX 4*1 and any other logic components )

for MUX 4*1 it just connects all inputs to its output by directed edge with different or similar weight...etc.

my problem is the first step(generate DAG graph from the text file)
although your valuable suggestions help me a lot but not specifically what i want
i've already written some functions to detect shortest and longest paths for any DAG graph

we don't need such functions (do_nand, do_xor ,....)because we're not gonna initialize the inputs to calculate the output value.

we just need function that if it sees such types
(x4=nand(x1,x2,x3),a4=xor(a1,a2,a3), y4=or(y1,y2,y3)) or any other types with same inputs#
in text treated them in same manner...i.e. all 3 types above are the same(must yield same outcome)
that is all my assignment about...

it looks like i have to read more about graphs...i got the main idea of your code...but it's hard for me to understand its details:(
once again thanks for your attention and time.

Duthomhas (13119)

Ah, well, that changes things a lot.

Your original posting did not make that clear to me.

Add another field to your nodes indicating the weight of the edge.

struct Edge
{
  std::string target_node;  // Name of targeted node
  int         weight;       // Edge weight
};

struct Node
{
  std::string node_type;  // INPUT, OUTPUT, NAND, etc
  std::vector <Edge> edges;  // List of edges
};

typedef std::map<std::string, Node> Graph;

You may notice that this is already a directed graph, and that the number of edges attached to a node is variable. Following the DAG works as before — no need to evaluate boolean stuff — instead you are doing a shortest/longest path traversal.

Another common way to represent a DAG is using a square matrix.
For example:

int weights[4][4] =
{ //G1, G2, G3, G4
  {  0,  5,  0,  0 }, // G1 --> G2(weight=5)
  {  0,  0,  1,  2 }, // G2 --> G3(weight=1) and G4(weight=2)
  {  0,  0,  0,  0 }, // G3 --> nothing
  {  0,  0,  0,  0 }, // G4 --> nothing
};

As you can see, this is a fairly sparse matrix, which becomes an issue when the number of nodes grows large. (space required is sizeof(node)*N*N)

ETH11674 (20)

i think it's efficient to use std::map instead of matrix.
may i have the code you've written(just the first part) in first place please?
i'm receiving several weird errors when i build my program and it breaks out!

Duthomhas (13119)

Most of the code I’ve written here is incomplete. It is designed to give you a structure to work from. It is not designed to work as-is.

For example, I used a function complain( "expected '('" );. This function doesn’t exist. It doesn’t necessarily have to exist. All it does is say “this is where you handle an error about a missing open parenthesis”. You could just return. Throw an exception. Print a bunch of smiley faces and quit. It is up to you.

Part of the power of a parser like this is that it always knows exactly what is expected, so it can complain with clarity and exactness about what is wrong with the input when an error occurs.

Things like an ellipsis (three periods, “...”) mean “stuff omitted here”.

IDK if it makes any difference for what I’ve put here, but you should also be compiling with C++14, or C++11 at minimum.

Reading input is not an easy task, even though so many people treat it as if it were. Unfortunately, you have to think carefully about how it works.

1 Read a name
2 No: end of input
3 Read an equal sign?
4 Yes: read the type
5 No: The name was the type
6 Read an open parenthesis?
7 No: Error
8 Read a name
9 No: Error
10 Read a comma?
and so on

I keep repeating myself in different ways so that it will kind of sink in at some point, and you’ll say oh, I get it and be able to write your own reader.

I know this is frustrating. And hard. Chances are half you classmates are watching this thread closely themselves because they are just as lost about it as you think you are.

I will not give you an answer to turn in as your own work. I’m an ornery jerk. I want you to figure it out and understand it yourself. And I’ll do my best to drag you along until you do. Because once you do, you will be one step ahead of people taking classes three levels above you, because they never quite got this right. And you’ll be much more pleased with your ability both now and in the future.

Take it in steps. Right now, all your program has to do is compile read the one piece of input without failing.

[string '='] string '(' string [',' string] ')'

Whether or not there is an '=', you need to try to read a string first.
If there was an '=', then the string is the node name.
Otherwise it was the type.

If it was the node name, then you need to read another string to get the node type.

The next thing you should get is an open parenthesis '('. If it isn't, complain about a bad input file format and quit.

Read another string. This is your first edge link.

The next piece of punctuation you read tells you what comes next. If you read a comma ',', there's another edge name there. Read the new name and append it to your list of node names, and read the next piece of punctuation.

At this point the last thing you read should be a close parenthesis ')'. If it isn't complain and quit.

If you haven't quit by now, you're golden. Add the new node to your graph.

Then start over to get the next node.

ETH11674 (20)

thank you for your valuable tips...i assure you, i'll do it...i'll post it if i did ;)

Topic archived. No new replies allowed.