I have no idea how to begin with this problem.

The Human Genome Project is an international effort to discover all of the approximately 30,000-100,000 human genes in the human genome (i.e. the total DNA). The project successfully completed the sequencing of the human genome in 2001, and the complete sequence of the 3 billion DNA subunits (called bases) was published in 2003. The results of this project, i.e. the DNA sequences, are publicly disseminated in an effort to lower the barriers to effective biomedical research. The Biotechnology Program at the UH College of Technology needs your assistance in analyzing DNA sequences. In computer terminology, DNA is a base-4 labeling system. This labeling system uses the letters A, T, C, and G for the four DNA bases adenine (A), thymine (T), cytosine (C), and guanine (G). A sequence is determined using automated technology, and errors in reading a base are typically denoted by the letter N. Your assignment is to read a file containing a DNA sequence and determine
(1) the total number of bases in the sequence
(2) the number of errors
(3) the total number of A bases
(4) the total number of G bases
(5) the total number of T bases
(6) the total number of C bases
(7) You also need to generate a graph of the distribution of the 4 bases in the sequence. Since DNA sequence files are large, with thousands of bases, divide your base counts by 100 when generating the graph to facilitate plotting a reasonable number of bases. Note that the values will be truncated in the graph (e.g. as seen below, the total count for base A is 1176, but the graph shows 11 A's, which is 1176/100.0 = 11.76 truncated to 11).
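
A rough sketch of the scaling in item (7), in case it helps to see the truncation spelled out. The helper names `scaledCount` and `graphRow` and the exact row layout (label, a bar, then the repeated letter) are my own assumptions; the sample report is not reproduced here, so match the column widths to whatever the assignment actually shows.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: how many symbols to plot for a raw base count.
int scaledCount(int count) {
    // 1176 / 100.0 = 11.76; the cast drops the fraction, leaving 11.
    return static_cast<int>(count / 100.0);
}

// Hypothetical helper: builds one graph row such as "A  | AAAAAAAAAAA".
// The layout is an assumption, not the assignment's required format.
std::string graphRow(char label, int count) {
    std::ostringstream row;
    // Left-align the label in a 3-character column.
    row << std::left << std::setw(3) << label << "| ";
    int n = scaledCount(count);
    if (n > 0) {
        // Printing an empty string in a field of width n, with the fill
        // character set to the label, yields n copies of the label.
        row << std::setfill(label) << std::setw(n) << "" << std::setfill(' ');
    }
    return row.str();
}
```

Writing `outfile << graphRow('A', 1176) << '\n';` to an open `std::ofstream` (here a hypothetical `outfile`) would then emit one row of the graph. The same manipulators work directly on the file stream if you would rather not build a string first.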

Your program should output a report to an output file with the information listed above and shown below. Sample input and output file:
DNAsequence1.txt (only a portion of the file is shown) DNAreport1.txt


Program Requirements:

(1) The input should be read from a file. Note that the bases in the input files are in lowercase letters, a, g, t, and c, and an error is denoted by n. Your program should check for input file failure and end with an appropriate message if an error occurs in reading the file. Make use of the eof() function when reading, since you do not know how many bases there are in each file. You can use the eof() function to determine when you have finished reading the last line of data in the file. When you use eof(), remember that eof() returns true only when you try to read data after you have reached the end. For example, suppose that you have a file that has only 1 line of data. The eof flag will not be set when you read that line, but later, when you try to read again in the next iteration. Example usage: if you have an ifstream variable called infile, then you can call the function as follows: infile.eof(). infile.eof() is false starting from the first line in the input file until the last line. It is false even after you finish reading data on the last line. It becomes true only when you try to read data again after you have finished reading the last line. Three different input files (DNAsequence1.txt, DNAsequence2.txt, and DNAsequence3.txt) are provided for you to test your code.
(2) The output should be written to a file. Format the output to be neatly aligned in columns as shown above. For both the report of the number of bases and the graph, you must use manipulators such as setw, setfill, left, and right.
(3) Use of a switch statement is recommended (for example, when counting the individual bases of type 'a', 'c', 'g', and 't' and the errors).
(4) Remember to close all files when done.
This is not a homework site. You have shown zero effort for trying to solve this problem on your own. Someone might be nice and help you, but chances are slim. If you are looking for someone to do your homework for you, try the Jobs section.

If you are being sincere in that you have absolutely no idea how to start, please begin with the site's C++ tutorial: http://www.cplusplus.com/doc/tutorial/
I also recommend a good book: https://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list

Otherwise, show what code you have tried so far, and what specific errors or bugs you're getting.
Someone please ban the above user ^^ (comologia) for spam and for trying to plug their website, which has nothing to do with programming.