

PUBMED FOR HANDHELDS



  • Title: An evolutionary analytical model of a complementary circular code simulating the protein coding genes, the 5' and 3' regions.
    Author: Arquès DG, Fallot JP, Michel CJ.
    Journal: Bull Math Biol; 1998 Jan; 60(1):163-94. PubMed ID: 9530018.
    Abstract:
    The self-complementary subset T0 = X0 ∪ {AAA, TTT} with X0 = {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC} of 22 trinucleotides has a preferential occurrence in the frame 0 (reading frame established by the ATG start trinucleotide) of protein (coding) genes of both prokaryotes and eukaryotes. The subsets T1 = X1 ∪ {CCC} and T2 = X2 ∪ {GGG} of 21 trinucleotides have a preferential occurrence in the shifted frames 1 and 2 respectively (frame 0 shifted by one and two nucleotides respectively in the 5'-3' direction). T1 and T2 are complementary to each other. The subset T0 contains the subset X0 which has the rarity property (6 × 10⁻⁸) to be a complementary maximal circular code with two permutated maximal circular codes X1 and X2 in the frames 1 and 2 respectively. X0 is called a C3 code. A quantitative study of these three subsets T0, T1, T2 in the three frames 0, 1, 2 of protein genes, and the 5' and 3' regions of eukaryotes, shows that their occurrence frequencies are constant functions of the trinucleotide positions in the sequences. The frequencies of T0, T1, T2 in the frame 0 of protein genes are 49, 28.5 and 22.5% respectively. In contrast, the frequencies of T0, T1, T2 in the 5' and 3' regions of eukaryotes, are independent of the frame. Indeed, the frequency of T0 in the three frames of 5' (respectively 3') regions is equal to 35.5% (respectively 38%) and is greater than the frequencies T1 and T2, both equal to 32.25% (respectively 31%) in the three frames. Several frequency asymmetries unexpectedly observed (e.g. the frequency difference between T1 and T2 in the frame 0), are related to a new property of the subset T0 involving substitutions. An evolutionary analytical model at three parameters (p, q, t) based on an independent mixing of the 22 codons (trinucleotides in frame 0) of T0 with equiprobability (1/22) followed by t ≈ 4 substitutions per codon according to the proportions p ≈ 0.1, q ≈ 0.1 and r = 1 - p - q ≈ 0.8 in the three codon sites respectively, retrieves the frequencies of T0, T1, T2 observed in the three frames of protein genes and explains these asymmetries. Furthermore, the same model (0.1, 0.1, t) after t ≈ 22 substitutions per codon, retrieves the statistical properties observed in the three frames of the 5' and 3' regions. The complex behaviour of these analytical curves is totally unexpected and a priori difficult to imagine.
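The set properties stated in the abstract can be checked mechanically. The following is a minimal Python sketch, not the authors' code. It assumes that "complementary" means closed under reverse complement, and that X1 and X2 are obtained from X0 by rotating each trinucleotide left by one and two positions; the abstract says only that they are "permutated" codes, so the rotation direction is an assumption. The function names and the toy sequence are hypothetical.

```python
# Sketch only: names and the rotation convention are assumptions, not from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The 20 trinucleotides of X0 as listed in the abstract.
X0 = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)


def reverse_complement(word):
    """Complement each base, then reverse the word."""
    return word.translate(COMPLEMENT)[::-1]


def rotate(word, k):
    """Circularly rotate a word left by k positions (ABC -> BCA for k=1)."""
    return word[k:] + word[:k]


def complement_set(words):
    """Set of reverse complements of all words in the given set."""
    return {reverse_complement(w) for w in words}


# Assumed construction of the permuted codes and of the three subsets.
X1 = frozenset(rotate(w, 1) for w in X0)
X2 = frozenset(rotate(w, 2) for w in X0)
T0 = X0 | {"AAA", "TTT"}
T1 = X1 | {"CCC"}
T2 = X2 | {"GGG"}


def subset_frequencies(seq, frame):
    """Fractions of trinucleotides in T0, T1, T2 when seq is read from offset frame."""
    words = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if not words:
        raise ValueError("sequence too short for the requested frame")
    return tuple(sum(w in t for w in words) / len(words) for t in (T0, T1, T2))


if __name__ == "__main__":
    print("T0 self-complementary:", complement_set(T0) == set(T0))
    print("T1 and T2 complementary:", complement_set(T1) == set(T2))
    print("sizes:", len(T0), len(T1), len(T2))
    toy = "GAAGACGTTATCCTG"  # hypothetical sequence made of X0 words only
    for frame in range(3):
        print("frame", frame, subset_frequencies(toy, frame))
```

Under these assumptions T0, T1 and T2 are pairwise disjoint and together cover all 64 trinucleotides, so the three frequencies returned for any frame sum to 1. Applying `subset_frequencies` to real gene sequences would be the way to compare against the percentages quoted in the abstract; the toy sequence here only illustrates the call.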