@sargon94
Each triplet of bases (a codon) refers to an amino acid for protein synthesis. It's quite complicated. DNA is double stranded. To synthesise a protein, first the DNA is unzipped (the strands come apart at the start of the gene that describes the protein). One of the two strands is called the "template strand" and the other the "coding strand": the coding strand actually describes the structure of the protein (the sequence of amino acids, represented by codons, i.e. triplets of nucleotides) and the template strand is simply its complement. The enzyme RNA polymerase reads the template strand and creates a "messenger" RNA (mRNA) strand which is complementary to it, and therefore equivalent to the coding strand. Strictly, transcription starts at a promoter sequence and ends at a terminator; the START and STOP codons sit inside the mRNA and tell the ribosome where to begin and end the protein. The mRNA is not quite identical to the coding strand because it's RNA instead of DNA: the DNA itself stays in the nucleus of the cell, so an RNA copy is sent out instead, and the main difference is that RNA uses a molecule called uracil (U) instead of thymine (T).
The mRNA strand leaves the nucleus and binds to something called a ribosome, which reads it. For each triplet of bases on the mRNA, starting at the START codon, the ribosome finds a tRNA (transfer RNA) molecule whose anticodon is complementary (has the opposite nucleotides). Each tRNA comes bound to an amino acid: for example, the tRNA with anticodon AAU carries leucine, which is therefore represented by UUA in mRNA (or TTA in DNA). The amino acid forms a peptide bond with the previous amino acid (if there is one), and then the tRNA of the previous amino acid unbinds. When the ribosome reaches a STOP codon, you have a chain of amino acids (a peptide chain or polypeptide; its sequence is known as the primary structure).
The chain then folds up to form a protein: local patterns such as helices and sheets are called secondary structure, and the overall three-dimensional shape is called tertiary structure. Some proteins, such as collagen, are formed from multiple polypeptide chains (quaternary structure).
NB: The terms "nucleotide" and "base" are often used interchangeably. Strictly, a nucleotide is a base attached to a sugar and a phosphate group, so every nucleotide contains a base, but a base on its own is not a nucleotide.
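If it helps to see the transcription and translation steps as code, here is a minimal sketch. Everything in it is my own illustration rather than a standard library: the function names are made up, strand direction (5'/3') is ignored, and the codon table is deliberately cut down to the four codons the example needs.

```python
# Complement table for DNA bases.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Deliberately tiny codon table (assumption: only what the example uses).
CODON_TABLE = {
    "AUG": "Met",  # methionine, also the START codon
    "UUA": "Leu",  # leucine, the example from the post
    "GGU": "Gly",  # glycine
    "UAA": None,   # STOP codon
}

def template_of(coding):
    """Base-by-base complement of the coding strand (direction ignored)."""
    return "".join(DNA_COMPLEMENT[base] for base in coding)

def transcribe(template):
    """Build the mRNA opposite the template strand, with U instead of T."""
    return "".join(DNA_COMPLEMENT[base] for base in template).replace("T", "U")

def translate(mrna):
    """Read codons from the first AUG until a STOP codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        # Codons missing from the cut-down table raise KeyError.
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # STOP codon: release the chain
            break
        peptide.append(amino_acid)
    return peptide

coding = "ATGTTAGGTTAA"
template = template_of(coding)
mrna = transcribe(template)
print(template)         # TACAATCCAATT
print(mrna)             # AUGUUAGGUUAA
print(translate(mrna))  # ['Met', 'Leu', 'Gly']
```

Note that the mRNA comes out the same as the coding strand with every T swapped for U, which is the "equivalent to the coding strand" point.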
How do you think I could go about chromosome swapping? From what I understand all of them have their own DNA inside them, so I'd be swapping only part of it out for the other |
In real life, chromosomes have a shape a bit like an X. One of the ways they cross over is by coiling their legs around each other; where they touch, the strands break and rejoin with the other chromosome, so the two end up swapping genes. In genetic algorithms it's a lot simpler. One way to do it (single-point crossover) is to pick a single point on the chromosome and swap every bit after that point with the other chromosome. You should probably do it that way.
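A rough sketch of that single-point swap, assuming each chromosome is stored as a string of bases and both are the same length (the function name and the optional `point` argument are just mine, for illustration):

```python
import random

def single_point_crossover(parent_a, parent_b, point=None, rng=random):
    """Swap everything after `point` between two equal-length chromosomes."""
    if len(parent_a) != len(parent_b):
        raise ValueError("chromosomes must be the same length")
    if point is None:
        # Pick a cut strictly inside the chromosome so both children
        # get material from both parents (needs length >= 2).
        point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

a = "AAAAAAAA"
b = "CCCCCCCC"
print(single_point_crossover(a, b, point=3))  # ('AAACCCCC', 'CCCAAAAA')
print(single_point_crossover(a, b))           # random cut point
```

The same slicing works on lists or bytes if you store the chromosome some other way.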
23 chromosomes with say... 64 base pairs each |
In biology, a human chromosome has roughly 50-250 Mbp (mega, i.e. million, base pairs), and one set of 23 adds up to about 3.1 billion base pairs. At best, you could fit four base pairs in a byte (since they need two bits each), so that would be at least 0.25 bytes * 3.1 Gbp = ~775 MB for one set, or about 1.5 GB for the two sets in a normal human cell. So if you want to simulate lots of people you should probably simplify it a little, unless you want to run it on a supercomputer. Or you could use a simple bacterium instead, since some of them have a genome of only about 30 kiB.
Here's something interesting (maybe): since gametes are haploid cells, they carry one set, i.e. about 775 MB of data, and sex therefore represents a transfer of 775 MB * 200,000,000 sperm = ~155 petabytes of data for the sake of 775 MB, which is an efficiency of about 0.000 000 5%. That's like uploading your entire hard drive a million times just to send someone a single JPG image.
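To make the "two bits per base pair, four per byte" idea concrete, here is a small sketch of packing a sequence that way. The bit assignment (A=00, C=01, G=10, T=11) and the function names are arbitrary choices of mine, not any standard format.

```python
# Arbitrary 2-bit code for each base (an assumption, not a standard).
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = "ACGT"

def packed_size_bytes(n_bases):
    """Bytes needed at 2 bits per base, rounded up to a whole byte."""
    return (n_bases + 3) // 4

def pack(seq):
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray(packed_size_bytes(len(seq)))
    for i, base in enumerate(seq):
        out[i // 4] |= BASE_TO_BITS[base] << (2 * (i % 4))
    return bytes(out)

def unpack(data, n_bases):
    """Recover the first n_bases bases from packed bytes."""
    return "".join(
        BITS_TO_BASE[(data[i // 4] >> (2 * (i % 4))) & 0b11]
        for i in range(n_bases)
    )

seq = "GATTACA"
packed = pack(seq)
print(len(packed))                 # 2 bytes for 7 bases
print(unpack(packed, len(seq)))    # GATTACA

# The toy genome from the question: 23 chromosomes of 64 base pairs.
print(packed_size_bytes(23 * 64))  # 368 bytes

# Transfer efficiency if one gamete out of 200,000,000 is used, in percent.
print(100 / 200_000_000)           # 5e-07
```

So the 23 x 64 genome from the question fits in 368 bytes per individual, which is tiny compared with a real genome and easily small enough to simulate a large population.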