6 research outputs found

    Numerical Characterization of DNA Sequence Based on Dinucleotides

    Sequence comparison is a primary technique for the analysis of DNA sequences. To make quantitative comparisons, one devises mathematical descriptors that capture the essence of the base composition and distribution of the sequence. Alignment methods and graphical techniques (where each sequence is represented by a curve in high-dimensional Euclidean space) have long been widely used. In this contribution we introduce a new nongraphical and nonalignment approach based on the frequencies of the dinucleotide XY in DNA sequences. The most important feature of this method is that it identifies not only adjacent XY pairs but also nonadjacent ones, where X and Y are separated by some number of nucleotides. This methodology preserves information in the DNA sequence that is ignored by other methods. We test our method on the coding regions of exon-1 of β-globin for 11 species and demonstrate its utility.
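
    The idea in this abstract can be illustrated with a minimal sketch that counts XY pairs in which Y follows X at a fixed distance. The function names, the `gap` and `max_gap` parameters, the normalization by the number of counted pairs, and the Euclidean comparison are all assumptions made for illustration; the abstract does not give the paper's actual definitions.

```python
from itertools import product

BASES = "ACGT"

def gapped_dinucleotide_freqs(seq, gap=0):
    """Relative frequency of each ordered pair XY with `gap` bases between X and Y.

    gap=0 gives ordinary adjacent dinucleotides. Pairs containing symbols
    other than A, C, G, T are skipped (an assumption of this sketch).
    """
    seq = seq.upper()
    counts = {x + y: 0 for x, y in product(BASES, repeat=2)}
    step = gap + 1
    total = 0
    for i in range(len(seq) - step):
        pair = seq[i] + seq[i + step]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short for this gap: return an all-zero profile.
        return {pair: 0.0 for pair in counts}
    return {pair: n / total for pair, n in counts.items()}

def descriptor(seq, max_gap=2):
    """Concatenate the 16 pair frequencies for gaps 0..max_gap into one vector."""
    vec = []
    for gap in range(max_gap + 1):
        freqs = gapped_dinucleotide_freqs(seq, gap)
        vec.extend(freqs[p] for p in sorted(freqs))
    return vec

def euclidean(u, v):
    """Plain Euclidean distance between two equal-length descriptors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
```

    Two sequences could then be compared with `euclidean(descriptor(s1), descriptor(s2))`, where a smaller distance suggests a more similar pair distribution under these assumptions.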

    Mitochondrial DNA for molecular taxonomy

    This bachelor's thesis deals with molecular taxonomy based on mitochondrial DNA. The introduction describes general and molecular taxonomy. The theoretical part then describes the structure and composition of the animal cell, deoxyribonucleic acid, and mitochondrial deoxyribonucleic acid. A further part covers DNA barcoding and numerical methods for representing genomic sequences. The practical part describes a program implementing the chosen numerical method for processing genomic sequences, together with programs for creating sequences and assigning them to reference species.

    Classification of organisms using nucleotides frequencies

    This thesis presents different approaches to the analysis of genomic data and the classification of organisms. It aims to compare the effectiveness of classical methods, which require mutual alignment of sequences and are therefore more computationally demanding, with modern approaches that use only the frequencies of individual nucleotides, or groups of nucleotides, in biological sequences.

    On the characterization of DNA primary sequences by triplet of nucleic acid bases

    We consider the construction of a set of smaller 4 × 4 matrices to represent DNA primary sequences, based on enumeration of all 64 triplets of nucleic acid bases. The leading eigenvalue of the constructed matrices has been selected as an invariant for constructing a vector that characterizes DNA. Additional invariants derived from the condensed matrices include a 64-component vector whose components correspond to the ordered triplets XYZ, with X, Y, Z = A, C, G, T. Similarity/dissimilarity tables based on the different invariants, constructed for DNA sequences from the first exon of the β-globin gene of eight species, illustrate the utility of the newly formulated invariants.
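
    One possible reading of this construction is sketched below. The abstract does not specify how the 64 triplets are arranged into 4 × 4 matrices, so the grouping here (one matrix per first base X, with the count of XYZ at row Y, column Z) is an assumption. The function names and the shifted power iteration used to estimate the leading eigenvalue are likewise illustrative choices, not the paper's procedure.

```python
BASES = "ACGT"

def triplet_counts(seq):
    """Count every overlapping triplet XYZ (64 keys); other symbols are skipped."""
    seq = seq.upper()
    counts = {x + y + z: 0 for x in BASES for y in BASES for z in BASES}
    for i in range(len(seq) - 2):
        t = seq[i:i + 3]
        if t in counts:
            counts[t] += 1
    return counts

def triplet_matrices(seq):
    """Assumed layout: one 4x4 matrix per first base X, entry [Y][Z] = count of XYZ."""
    counts = triplet_counts(seq)
    return {x: [[counts[x + y + z] for z in BASES] for y in BASES] for x in BASES}

def leading_eigenvalue(m, iters=500):
    """Estimate the largest eigenvalue of a nonnegative square matrix.

    Runs power iteration on m + I (the shift avoids oscillation for
    periodic matrices) and subtracts the shift at the end. Convergence
    can be slow for defective matrices, so treat the result as approximate.
    """
    n = len(m)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [v[i] + sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam - 1.0

def eigen_descriptor(seq):
    """Four-component vector of leading eigenvalues, ordered A, C, G, T."""
    mats = triplet_matrices(seq)
    return [leading_eigenvalue(mats[x]) for x in BASES]
```

    Under these assumptions, `eigen_descriptor(seq)` gives a four-component vector per sequence, and the dictionary from `triplet_counts(seq)` plays the role of the 64-component vector.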