Translating

closed account (GybDjE8b)
2) Identify the start codon: In order for RNA to be translated into
amino acids, it must be in a coding region. Therefore, we will not start
looking for the code for tryptophan until we see the start codon.
The start codon we are looking for is AUG. When you find it, print something
out like "The coding region starts at position X", where X is the position
of the array where you find the first letter of the start codon.

I don't understand how to do this. I tried writing the program, but it prints 333 instead of just 3. Can somebody help or explain?

#include <iostream>
#include <string>
using namespace std;

void codon (string start)
	{
		int position = 0;
		char x = 'A';
		for (int i = 0; i < start.length(); i++)
	{
		if (start == "AUG" )
		{
			for (int j = 0; j < start.length(); j++)
			{
				if (start[j] = x)
				{
					position = j;
				}
				position = j+1;
			}
			
		}
		cout << position;
	}
};
void codon (string);
int main()
{
	char Adenine = 'A';
	char Guanine = 'G';
	char Cytosine = 'C';
	char Thymine = 'T';
	string strand;
	
	cout << "Please input your DNA:" << endl;
	cin >> strand;
	cout << "This is the RNA sequence your entered: " <<strand << endl;
	codon(strand);
	
	system ("PAUSE");
	return 1;
}
@darkflames33

You can't really look for your sequence that way. Check for each letter individually, as I have here, below.

#include <iostream>
#include <string>

using namespace std;

void codon (string);

int main()
{
 char Adenine = 'A';
 char Guanine = 'G';
 char Cytosine = 'C';
 char Thymine = 'T';
 string strand;

 cout << "Please input your DNA:" << endl;
 cin >> strand;
 cout << "This is the RNA sequence your entered: " <<strand << endl;
 codon(strand);

 system ("PAUSE");
 return 1;
}

void codon (string start)
{
 int position = 0, len = start.length();
 char x = 'A';
 for (int i = 0; i < len; i++)
 {
	if(start[i] == 'A' && start[i+1] == 'U' && start[i+2] == 'G')
	   position=i+1; // Add 1 since the checking starts with a 0
 }
if(position > 0)
  cout << "The coding region starts at position " << position << endl;
else
  cout << "Could not find the codon region..." << endl;
};
closed account (GybDjE8b)
Find the instances of tryptophan, and the end codon: After finding the
start codon, you will be looking for the codon for tryptophan, which is UGG.
You will look until you either reach the end of the input, or see the stop
codon (UAG). After the start codon, each group of 3 bases makes a codon.
Then, you will print out how many times tryptophan is coded for, and where
the coding region ends.

Is there a way for me to break the sequence into groups of 3 bases after the start codon? And would this be right: if I see UAG, the loop stops?
void codon (string start)
{
 int position = 0, len = start.length();
 char x = 'A';
 for (int i = 0; i < len; i++)
 {
	if(start[i] == 'A' && start[i+1] == 'U' && start[i+2] == 'G')
	   position=i+1; // Add 1 since the checking starts with a 0
       else(start[i] == 'U' && start[i+1] == 'A' && start[i+2] =='G'
            break;
 }
if(position > 0)
  cout << "The coding region starts at position " << position << endl;
else
  cout << "Could not find the codon region..." << endl;
};
To find a sequence in a string, use the find function:

http://www.cplusplus.com/reference/string/string/find/

example:
std::string::size_type aug_pos = start.find("AUG");
if(aug_pos != std::string::npos)
{
  cout << "The coding region starts at position " << aug_pos + 1 << endl;
  std::string::size_type uag_pos = start.find("UAG", aug_pos + 3);
  if(uag_pos != std::string::npos)
  {
    cout << "The coding region ends at position " << uag_pos + 1 << endl;
  }
  else
     ...
}
else
  cout << "Could not find the codon region..." << endl;


This is not correct:
 for (int i = 0; i < len; i++) // to make below valid use len-2
 {
	if(start[i] == 'A' && start[i+1] == 'U' && start[i+2] == 'G')
because i+1 and i+2 may be out of bounds.
closed account (GybDjE8b)
I don't understand why it has to be len-2. Wouldn't that subtract 2 from the length, making it shorter?

And let's say we use your code:
void codon (string start)
{
	int tryptophan = 0;
	string::size_type aug_pos = start.find("AUG");
if(aug_pos != string::npos)
{
  cout << "The coding region starts at position " << aug_pos + 1 << endl;
  string::size_type ugg_pos = start.find("UGG", ugg_pos + 3);
  string::size_type uag_pos = start.find("UAG", aug_pos + 3);
  if(ugg_pos != string::npos)
  {
    tryptophan++;
	cout << tryptophan << endl;
  }
  else if(aug_pos != string::npos)
	  cout << "The coding region ends at position " << uag_pos + 1 << endl;
}
else
  cout << "Could not find the codon region..." << endl;
}


I'm getting an error at ugg_pos. Can you also explain to me what npos is?
I don't understand why it has to be len-2. Wouldn't that subtract 2 from the length, making it shorter?


From line 7 of your code:
 
if(start[i] == 'A' && start[i+1] == 'U' && start[i+2] == 'G')

What happens when len == 10 and i == 9? start[i+1] and start[i+2] reference elements that are out of bounds.

Can you also explain to me what npos is?

http://www.cplusplus.com/reference/string/string/npos/

As a return value, it is usually used to indicate no matches.
closed account (GybDjE8b)
Oh, okay. Can you explain why I'm getting a bug with this code?
void codon (string start)
{
	int tryptophan = 0;
	string::size_type aug_pos = start.find("AUG");
if(aug_pos != string::npos)
{
  cout << "The coding region starts at position " << aug_pos + 1 << endl;
  string::size_type ugg_pos = start.find("UGG", ugg_pos + 3);
  string::size_type uag_pos = start.find("UAG", aug_pos + 3);
  if(ugg_pos != string::npos)
  {
    tryptophan++;
	cout << tryptophan << endl;
  }
  else if(aug_pos != string::npos)
	  cout << "The coding region ends at position " << uag_pos + 1 << endl;
}
else
  cout << "Could not find the codon region..." << endl;
}
Line 8 (the declaration of ugg_pos): What's the value of ugg_pos when find is called?