
    Additional file 3: of A method for identification of highly conserved elements and evolutionary analysis of superphylum Alveolata

    Presents summary data on the clusters. The Resume sheet summarizes the algorithm's results under different parameters. The variant with a threshold length of 65 bp, which is discussed in the main paper, is highlighted in pink. Lines 15–42 show the number of m-dense clusters and their vertices for m values from 30 down to 3. For example (line 15): 14 clusters were found containing words from all 30 genomes; these clusters comprise a total of 3736 vertices, i.e. about 9 words per genome per cluster on average. Another example (line 35): 84 clusters were found containing words from 10 genomes, with 1.2 words per genome on average. On the Clusters sheet, each line from the sixth onward corresponds to a cluster. Column A contains the cluster number, which is highlighted when the UCE is untranslated or unknown (as in Additional file 2). Column B shows the HCE type as follows: if any of the cluster's words was found in Rfam, the cluster corresponds to a known RNA such as tRNA, snRNA, etc., and the column contains that RNA label; if any of the cluster's words overlaps a CDS, the cluster corresponds to a protein (exon) and is labeled as a protein; if any of the cluster's words overlaps a gene, the cluster corresponds to an intron or other untranslated region and is labeled as an intron; otherwise the cluster describes an unknown HCE (no label in column B). Column C shows the total number of words in the cluster; column D, the total number of species containing these words; and columns E–AH, the number of words from each species. (XLSX 1109 kb)
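    The column B labeling rule and the per-genome average quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names, the input layout (one entry per cluster word), and the choice to return the first Rfam label when several words have hits are all hypothetical. The sketch also assumes the rules are applied in the order listed, so that a word overlapping a CDS is labeled as a protein before the more general gene-overlap rule is reached.

```python
from typing import List, Optional


def label_cluster(rfam_labels: List[Optional[str]],
                  overlaps_cds: List[bool],
                  overlaps_gene: List[bool]) -> str:
    """Return a column-B style label for one cluster.

    Each list holds one entry per word in the cluster (hypothetical layout).
    Rules are tried in the order given in the description (an assumption).
    """
    # Rule 1: any word found in Rfam -> use that RNA label (first hit wins here).
    for label in rfam_labels:
        if label is not None:
            return label
    # Rule 2: any word overlapping a CDS -> protein (exon).
    if any(overlaps_cds):
        return "protein"
    # Rule 3: any word overlapping a gene (but no CDS) -> intron / untranslated.
    if any(overlaps_gene):
        return "intron"
    # Rule 4: unknown HCE, no label in column B.
    return ""


def mean_words_per_genome(total_vertices: int, n_clusters: int, n_genomes: int) -> float:
    """Average number of words per genome per cluster."""
    return total_vertices / (n_clusters * n_genomes)


if __name__ == "__main__":
    # Figures from line 15 of the Resume sheet: 14 clusters, 3736 vertices, 30 genomes.
    print(mean_words_per_genome(3736, 14, 30))
    print(label_cluster([None, "tRNA"], [False, False], [True, True]))
```

    With the line 15 figures, the average comes to a little under 9, which matches the rounded value given in the description.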