Counting characters in a text file

Hello, I am working on an assignment whose objective is:
Your assignment is to read a file containing a DNA sequence and determine
(1) the total number of bases in the sequence
(2) the total number of A bases
(3) the total number of G bases
(4) the total number of T bases
(5) the total number of C bases
(6) You also need to generate a graph of the distribution of the 4 bases in the sequence. Since DNA sequence files
are large, with thousands of bases, divide your base counts by 100 when generating the graph to facilitate plotting
a reasonable number of bases. Note that the values will be truncated in the graph (e.g. as seen below, the total count for
base A is 1176, but the graph shows 11 A’s, which is 1176/100.0 = 11.76 truncated to 11).
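The truncation described in item (6) is what integer division does in C++. A minimal sketch, assuming one graph row per base drawn with `*` characters (the row layout and the function name `graphRow` are made up for illustration; the assignment's actual graph format is not shown here):

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds one graph row such as "A: ***********".
// The bar length is count / 100 with integer division, which truncates
// toward zero instead of rounding (1176 / 100 gives 11, not 12).
std::string graphRow(char label, int count)
{
    std::ostringstream row;
    int stars = count / 100;
    row << label << ": ";
    if (stars > 0)
    {
        // setfill + setw pad an empty string out to `stars` characters,
        // so the padding itself becomes the bar.
        row << std::setfill('*') << std::setw(stars) << "";
    }
    return row.str();
}
```

For example, `graphRow('A', 1176)` returns the label followed by 11 stars, and a count below 100 gives an empty bar.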

The guidelines are:
(1) The input should be read from a file. Note that the bases in the input files are lowercase letters: a, g, t, and c. Your
program should check for input file failure and end with an appropriate message if an error occurs in reading the file.
Make use of the eof() function when reading, since you do not know how many bases there are in each file. You can
use eof() to determine when you have finished reading the last line of data in the file. When you use eof(),
remember that it returns true only after you try to read data past the end of the file. For example,
suppose that you have a file that has only 1 line of data. The eof flag will not be set when you read that line, but only after
you try to read again. Example usage: if you have an ifstream variable called infile, you can call the function as
infile.eof(). infile.eof() is false from the first line in the input file through the last line. It is false even after
you finish reading the data on the last line. It becomes true only when you try to read again after you have finished
reading the last line. Three different input files (DNAsequence1.txt, DNAsequence2.txt, DNAsequence2.txt) are
provided for you to test your code.
(2) The output should be written to a file. Format the output to be neatly aligned in columns as shown above. For both the
report of the number of bases and the graph, you must use the manipulators setw, setfill, left, right, etc.
(3) Use of a switch statement is recommended (for example, when counting the individual bases of type ‘a’, ‘c’, ‘g’ and ‘t’).
(4) Remember to close all files when done.
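Guidelines (1) and (3) can be combined in a small read loop. This is only a sketch under some assumptions: the names `BaseCounts` and `countBases` are invented for illustration, and the function reads one character at a time from any input stream (an `std::ifstream` in the real program) rather than line by line. Because eof() only becomes true after a read has failed, the loop does one read before the test and one at the end of each pass.

```cpp
#include <fstream>
#include <istream>
#include <sstream>

// Hypothetical container for the four base counts.
struct BaseCounts
{
    int a = 0;
    int c = 0;
    int g = 0;
    int t = 0;
    int total() const { return a + c + g + t; }
};

// Counts lowercase a/c/g/t read from `in`; every other character
// (newlines, spaces, stray symbols) is ignored.
BaseCounts countBases(std::istream& in)
{
    BaseCounts counts;
    char base;

    in.get(base);                        // priming read
    // eof() turns true only after a read runs past the end; fail() is
    // also checked so a stream that never opened cannot loop forever.
    while (!in.eof() && !in.fail())
    {
        switch (base)
        {
            case 'a': ++counts.a; break;
            case 'c': ++counts.c; break;
            case 'g': ++counts.g; break;
            case 't': ++counts.t; break;
            default:  break;             // not a base, skip it
        }
        in.get(base);                    // next read; sets eof at the end
    }
    return counts;
}
```

With a file, the call would look like `std::ifstream infile("DNAsequence1.txt");` followed by a check that the stream opened, then `BaseCounts counts = countBases(infile);`.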

So far my program looks like this:
#include <iostream> // must include both to read from file
#include <fstream>
#include <cstdlib>
#include <string>
using namespace std;

int main()
{
	char a, g, t, c;
	int totalbases;

	ifstream infile;
	ofstream outfile;

	infile.open("DNAsequence1.txt");
	outfile.open("DNAreport1.txt");

	if ( ! outfile.good() )
	{
		cout << "could not open output file" << endl;
		exit (1);  
	}
////////////////////////////////////////////////////////////////////////////////


   int letters[26]; // array to count letters
  
   //initialize arrays to zero
   for(int i = 0; i < 26; i++) letters[i] = 0;
   
 
   ifstream file;
   string filename;
   string line;
   int len;



	////////////////////////////////////////////////////////////////////////////////

	   while(!infile.eof())  
   {
      getline(infile, line);
      len = line.length();
      if (len == 0) continue; // skip blank lines
       
      for(int i = 0; i < len; i++)
      {
         char c = line.at(i);
         if (c >= 65 && c <= 90)
         {
            letters[c - 65]++;
         }
         else if (c >= 97 && c <= 122)
         {
            letters[c - 97]++;
         }
      }
   }

   file.close();

   // print results;
   cout << endl;

   for (char c = 'A'; c <= 'Z'; c++)
   {
      if (letters[c - 65] > 0) cout << c << " : " << letters[c - 65] << endl;
   }

 
////////////////////////////////////////////////////////////////////////////

	outfile << " REPORT on DNA sequence " << endl
	<< " Total number of bases: " << endl
	<< " Total number of A: " << a << endl
	<< " Total number of G: " << g << endl
	<< " Total number of T: " << t << endl
	<< " Total number of C: " << c << endl;
	infile.close();
	outfile.close();

	cout << "\nDone\n" << endl;
	return 0;
}

However, I can't get the results to output to the text file; instead, the output is shown on the screen.
I am just learning how to program, and this assignment has me stumped. I have been trying for hours with no success.
Any help or advice would be greatly appreciated. Thank you very much.
However I can't get it to results to output in the text file

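You could be more specific. Do you get no output file at all, or does the file not contain what you expect? When you write to the output file on lines 74 through 79 (the outfile << statement), you use the variables a, g, t and c. None of those variables is ever assigned a value in your program (your compiler probably warns you about this); the c declared inside your loops is a separate variable that hides the outer one. They are also chars, not the counts you appear to want to write to the output file. Your counts are in the letters array, and the only place you print that array is to cout, which is why the results show up on the screen.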
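One way to act on that, as a rough sketch rather than a finished solution: send the integer counts to the output stream instead of the unassigned chars. The function name `writeReport` and the column widths are assumptions chosen for illustration. In the program above the counts would come from the letters array, e.g. letters['a' - 'a'], letters['g' - 'a'], letters['t' - 'a'] and letters['c' - 'a'].

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>

// Hypothetical helper: writes the report to any output stream, so it
// works with the ofstream from the program above as well as with cout.
void writeReport(std::ostream& out, int countA, int countG, int countT, int countC)
{
    const int labelWidth = 24;   // assumed column widths, adjust to taste
    const int valueWidth = 8;
    int total = countA + countG + countT + countC;

    out << "REPORT on DNA sequence\n";
    // Labels are left-aligned and numbers right-aligned so the columns line up.
    out << std::left << std::setw(labelWidth) << "Total number of bases:"
        << std::right << std::setw(valueWidth) << total << '\n';
    out << std::left << std::setw(labelWidth) << "Total number of A:"
        << std::right << std::setw(valueWidth) << countA << '\n';
    out << std::left << std::setw(labelWidth) << "Total number of G:"
        << std::right << std::setw(valueWidth) << countG << '\n';
    out << std::left << std::setw(labelWidth) << "Total number of T:"
        << std::right << std::setw(valueWidth) << countT << '\n';
    out << std::left << std::setw(labelWidth) << "Total number of C:"
        << std::right << std::setw(valueWidth) << countC << '\n';
}
```

Calling `writeReport(outfile, letters[0], letters[6], letters[19], letters[2]);` before `outfile.close();` would put the numbers in DNAreport1.txt instead of on the screen.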