Selective intra-dinucleotide interactions and periodicities of bases separated by K sites: a new vision and tool for phylogeny analyses
Biol. Res; 50 (), 2017
Publication year: 2017
Direct tests of the random or non-random distribution of nucleotides on genomes have been devised to test the hypothesis of neutral, nearly-neutral or selective evolution. These tests are based on the direct base distribution and are independent of the functional (coding or non-coding) or structural (repeated or unique sequences) properties of the DNA. The first approach described the longitudinal distribution of bases in tandem repeats under the Bose-Einstein statistics. A huge deviation from randomness was found. A second approach was the study of the base distribution within dinucleotides whose bases were separated by 0, 1, 2... K nucleotides. Again an enormous difference from the random distribution was found with significances out of tables and programs. These test values were periodical and included the 16 dinucleotides. For example a high ¨positive¨ (more observed than expected dinucleotides) value, found in dinucleotides whose bases were separated by (3K + 2) sites, was preceded by two smaller ¨negative¨ (less observed than expected dinucleotides) values, whose bases were separated by (3K) or (3K + 1) sites. We examined mtDNAs, prokaryote genomes and some eukaryote chromosomes and found that the significant non-random interactions and periodicities were present up to 1000 or more sites of base separation and in human chromosome 21 until separations of more than 10 millions sites. Each nucleotide has its own significant value of its distance to neutrality; this yields 16 hierarchical significances. A three dimensional table with the number of sites of separation between the bases and the 16 significances (the third dimension is the dinucleotide, individual or taxon involved) gives directly an evolutionary state of the analyzed genome that can be used to obtain phylogenies. An example is provided.
Secuencia de Bases/genética, Genoma, Nucleótidos/genética, Filogenia, Análisis de Secuencia de ADN/métodos, Colágeno/genética, ADN Mitocondrial/genética, Drosophila melanogaster/genética, Epistasis Genética/genética, VIH-1/genética, Nucleótidos/química, Células Procariotas/química, Algoritmos, Distribución de Chi-Cuadrado, Estructuras Cromosómicas, Evolución Molecular, Flujo Genético, Periodicidad, Valores de Referencia, Secuencias Repetidas en Tándem