
    Biological sequences as pictures – a generic two dimensional solution for iterated maps

    Background: Representing symbolic sequences graphically using iterated maps has enjoyed enduring popularity since it was first proposed by Jeffrey (1990) as the chaos game representation (CGR). The usefulness of this representation goes beyond the convenience of a scale-independent representation: it provides a variable memory length representation of transition. This includes the representation of succession with non-integer order, which comes with the promise of generalizing Markovian formalisms. The original proposal targeted genomic sequences only, but several generalizations have since been proposed, many specifically designed to handle protein data. Results: The challenge of a general solution is that of deriving a bijective transformation of symbolic sequences into bi-dimensional planes. More specifically, it requires the regular fractal nesting of polygons. A first attempt at a general solution was proposed by Fiser (1994), using non-overlapping circles that contain the polygons. This was used as a starting point to identify a more efficient solution in which the encapsulating circles can overlap without the same happening for the sequence maps, which are circumscribed to fractal polygon domains. Conclusion: We identified the optimal inscribed packing solution for iterated maps of any biological sequence, indeed of any symbolic sequence. The new solution maintains the prized bijective mapping property and includes the Sierpinski triangle and the CGR square as particular solutions of the more encompassing formulation.
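
    The CGR square mentioned above can be illustrated with a minimal sketch of the classic four-corner chaos game for DNA, in which each new point lies halfway between the previous point and the corner assigned to the current nucleotide. The corner layout, the starting point, and the name `cgr_points` are assumptions made for illustration; the generalized polygon packing described in the abstract is not implemented here.

```python
# Minimal chaos game representation (CGR) for DNA using the
# halfway-to-the-corner rule. The corner layout below is one common
# convention and is an assumption, not taken from the abstract.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return one CGR coordinate per symbol in `sequence`."""
    x, y = start
    points = []
    for symbol in sequence.upper():
        cx, cy = CORNERS[symbol]  # raises KeyError for symbols outside A/C/G/T
        # Move halfway from the current point toward the symbol's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in cgr_points("ACGT"):
        print(point)
```

    Because each step halves the distance to a corner, the final point always falls in the quadrant of the last symbol, and the sub-quadrant within it is set by the symbol before that, which is what makes the map scale independent.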

    Mitochondrial DNA Profiling by Fractal Lacunarity to Characterize the Senescent Phenotype as Normal Aging or Pathological Aging

    Biocomplexity, chaos, and fractality can explain the heterogeneity of aging individuals by regarding longevity as a "secondary product" of the evolution of a dynamic nonlinear system. Genetic-environmental interactions drive the individual senescent phenotype toward normal, pathological, or successful aging. Mitochondrial dysfunctions and mitochondrial DNA (mtDNA) mutations represent a possible mechanism shared by disease(s) and the aging process. This study aims to characterize the senescent phenotype and discriminate between normal (nA) and pathological (pA) aging by mtDNA mutation profiling. MtDNA sequences from hospitalized and non-hospitalized subjects (age range: 65-89 years) were analyzed and compared to the revised Cambridge Reference Sequence (rCRS). Fractal properties of mtDNA sequences were displayed by the chaos game representation (CGR) method, previously modified to deal with heteroplasmy. Fractal lacunarity analysis was applied to characterize the senescent phenotype on the basis of mtDNA sequence mutations. Lacunarity parameter beta, from our hyperbola model function, was statistically different (p < 0.01) between the nA and pA groups. A parameter beta cut-off value of 1.26 × 10⁻³ identifies 78% of nA and 80% of pA subjects. This also agrees with the presence of MT-CO gene variants, peculiar to nA (C9546m, 83%) and pA (T9900w, 80%) mtDNA, respectively. Fractal lacunarity can discriminate the senescent phenotype evolving as normal or pathological aging by individual mtDNA mutation profile.
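
    As a rough illustration of what a lacunarity measurement involves, the sketch below computes the standard gliding-box lacunarity of a binary image, defined as the second moment of the box mass divided by the squared mean. This is an assumption about the general family of estimator: the abstract does not specify how lacunarity is computed, and the hyperbola model and its parameter beta are not reproduced here. The function name `gliding_box_lacunarity` is hypothetical.

```python
def gliding_box_lacunarity(grid, box):
    """Gliding-box lacunarity of a binary 2D grid for a square box of side `box`.

    Computed as E[M^2] / E[M]^2, where M is the number of occupied cells
    covered by the box at each position. Illustrative sketch only; not the
    hyperbola-based method referred to in the abstract.
    """
    rows, cols = len(grid), len(grid[0])
    masses = []
    for r in range(rows - box + 1):
        for c in range(cols - box + 1):
            mass = sum(grid[r + i][c + j] for i in range(box) for j in range(box))
            masses.append(mass)
    mean = sum(masses) / len(masses)
    if mean == 0:
        raise ValueError("grid has no occupied cells at this box size")
    second_moment = sum(m * m for m in masses) / len(masses)
    return second_moment / (mean * mean)


if __name__ == "__main__":
    full = [[1] * 4 for _ in range(4)]
    clumped = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    print(gliding_box_lacunarity(full, 2))
    print(gliding_box_lacunarity(clumped, 2))
```

    A homogeneous image gives a value of 1, and values grow as the occupied cells become more clumped, so gaps in a CGR image raise the measure.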

    A quantitative method for measuring and visualizing species' relatedness in a two-dimensional Euclidean space.

    Representing DNA sequences graphically and evaluating, as well as displaying, species' relationships have been considered an important aspect of molecular biology research. A novel approach is proposed in this thesis that combines three methods: a) Chaos Game Representation (CGR), to portray quantitative characteristics of a DNA sequence as a black-and-white image, b) the Structural Similarity (SSIM) index, an image comparison method, to compute pairwise distances between these images, and c) Multidimensional Scaling (MDS), to visually display each sequence as a point in a two-dimensional Euclidean space. The proposed method produces a visual representation called a Genome Distance Map (GDM) when applied to a collection of genomic DNA sequences. In the resulting Genome Distance Map, the sequences can be visualized as points in a common two-dimensional Euclidean space, wherein the geometric distance between any two points approximates the difference between their respective DNA sequence compositions. In addition, the proposed Genome Distance Map provides a compelling visualization of species' relatedness in comparison to phylogenetic trees. Moreover, the proposed method is sensitive and robust in detecting insertions, deletions, and substitutions of nucleotides in a genome.

    Biocomplexity and Fractality in the Search of Biomarkers of Aging and Pathology: Focus on Mitochondrial DNA and Alzheimer's Disease

    Alzheimer's disease (AD) represents a major health concern for our growing elderly population. It accounts for increasing impairment of cognitive capacity, followed by loss of executive function in the late stage. AD pathogenesis is multifaceted and difficult to pinpoint, and understanding AD etiology will be critical to effectively diagnose and treat the disease. An interesting hypothesis concerning AD development postulates a cause-effect relationship between accumulation of mitochondrial DNA (mtDNA) mutations and the neurodegenerative changes associated with this pathology. Here we propose a computerized method for easy and fast mtDNA mutation-based characterization of AD. The method has been built taking into account the complexity of living beings and the fractal properties of many anatomic and physiologic structures, including mtDNA. Treating mtDNA mutations as gaps in the nucleotide sequence, fractal lacunarity appears to be a suitable tool to differentiate between aging and AD. Therefore, the Chaos Game Representation method has been used to display DNA fractal properties after adapting the algorithm to also visualize heteroplasmic mutations. Parameter β from our fractal lacunarity method, based on a hyperbola model function, has been measured to quantitatively characterize AD on the basis of mtDNA mutations. Results from this pilot study to develop the method show that fractal lacunarity parameter β of mtDNA is statistically different in AD patients when compared to age-matched controls. Fractal lacunarity analysis represents a useful tool to analyze mtDNA mutations. Lacunarity parameter β is able to characterize the individual mutation profile of the mitochondrial genome and appears to be a promising index to discriminate between AD and aging.

    Molecular Distance Maps: An alignment-free computational tool for analyzing and visualizing DNA sequences' interrelationships

    In an attempt to identify and classify species based on genetic evidence, we propose a novel combination of methods to quantify and visualize the interrelationships between thousands of species. This is possible by using Chaos Game Representation (CGR) of DNA sequences to compute genomic signatures, which we then compare by computing pairwise distances. In the last step, the original DNA sequences are embedded in a high-dimensional space using Multi-Dimensional Scaling (MDS) before everything is projected onto a Euclidean 3D space. To start with, we apply this method to a mitochondrial DNA dataset from NCBI containing over 3,000 species. The analysis shows that the oligomer composition of full mtDNA sequences can be a source of taxonomic information, suggesting that this method could be used for unclassified species and taxonomic controversies. Next, we test the hypothesis that the CGR-based genomic signature is preserved along a species' genome by comparing inter- and intra-genomic signatures of nuclear DNA sequences from six different organisms, one from each kingdom of life. We also compare six different distances and assess their performance using statistical measures. Our results support the existence of a genomic signature for a species' genome at the kingdom level. In addition, we test whether CGR-based genomic signatures originating only from nuclear DNA can be used to distinguish between closely related species, and we answer in the negative. To overcome this limitation, we propose the concept of "composite signatures", which combine information from different types of DNA, and we show that they can effectively distinguish all closely related species under consideration. We also propose the concept of "assembled signatures", which, among other advantages, do not require a long contiguous DNA sequence but can be built from smaller ones consisting of ~100-300 base pairs.
    Finally, we design an interactive web tool, MoDMaps3D, for building three-dimensional Molecular Distance Maps. The user can explore an already existing map or build their own using NCBI's accession numbers as input. MoDMaps3D is platform independent, written in JavaScript, and can run in all major modern browsers.
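
    The first two steps of the pipeline above (signature, then pairwise distance) can be sketched as follows. The sketch uses plain k-mer relative frequencies as the signature, on the basis that the cell counts of a CGR image at resolution 2^k × 2^k correspond to k-mer counts. The Euclidean distance is an assumption, since the abstract does not name the six distances it compares, and the names `kmer_signature` and `distance_matrix` are hypothetical. The MDS embedding step is not shown.

```python
from itertools import product
from math import sqrt


def kmer_signature(sequence, k=3):
    """Relative frequency of every DNA k-mer, in lexicographic order."""
    sequence = sequence.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    total = 0
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in counts:  # skip windows with ambiguous symbols such as N
            counts[kmer] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence has no valid k-mers")
    return [counts[key] / total for key in sorted(counts)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(sequences, k=3):
    """Symmetric matrix of pairwise signature distances."""
    signatures = [kmer_signature(s, k) for s in sequences]
    return [[euclidean(a, b) for b in signatures] for a in signatures]


if __name__ == "__main__":
    for row in distance_matrix(["ACGTACGT", "ACGTACGT", "TTTTTTTT"], k=2):
        print(row)
```

    The resulting matrix is the kind of input an MDS routine would take to place each sequence as a point in a low-dimensional map.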

    Computing distribution of scale independent motifs in biological sequences

    The use of Chaos Game Representation (CGR) or its generalization, Universal Sequence Maps (USM), to describe the distribution of biological sequences has been found objectionable because of the fractal structure of that coordinate system. Consequently, the investigation of the distribution of symbolic motifs at multiple scales is hampered by an inexact association between distance and sequence dissimilarity. A solution to this problem could unleash the use of iterative maps as phase-state representations of sequences whose statistical properties can be conveniently investigated. In this study a family of kernel density functions is described that accommodates the fractal nature of iterative function representations of symbolic sequences and, consequently, enables the exact investigation of sequence motifs of arbitrary lengths in that scale-independent representation. Furthermore, the proposed kernel density includes both Markovian succession and currently used alignment-free sequence dissimilarity metrics as special solutions. Therefore, the fractal kernel described is in fact a generalization that provides a common framework for a diverse suite of sequence analysis techniques.

    Local Renyi entropic profiles of DNA sequences

    Background: In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition explored therein was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results: The new methodology enables two results. On the one hand, it shows that the entropic profiles are directly related to the statistical significance of motifs, allowing the study of under- and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables, and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. Conclusion: The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
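
    For orientation, the sketch below estimates a Rényi entropy from a Parzen-window density over 2D points such as CGR coordinates. Several choices are assumptions rather than details taken from the abstract: the entropy order (2, the quadratic case, which has a closed form), the isotropic Gaussian kernel, the bandwidth `sigma`, and the function name. It implements neither the fractal kernel nor the local entropic profiles the report defines.

```python
from math import exp, log, pi


def renyi_quadratic_entropy(points, sigma=0.1):
    """Parzen-window estimate of the order-2 Renyi entropy of 2D points.

    With Gaussian kernels of variance sigma^2, the integral of the squared
    density estimate reduces to an average of Gaussians of variance
    2 * sigma^2 evaluated at all pairwise differences.
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    var = 2.0 * sigma * sigma          # variance after convolving two kernels
    norm = 1.0 / (2.0 * pi * var)      # 2D isotropic Gaussian normalization
    total = 0.0
    for x1, y1 in points:
        for x2, y2 in points:
            d2 = (x1 - x2) ** 2 + (y1 - y2) ** 2
            total += norm * exp(-d2 / (2.0 * var))
    return -log(total / (n * n))


if __name__ == "__main__":
    clustered = [(0.5, 0.5)] * 4
    spread = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    print(renyi_quadratic_entropy(clustered))
    print(renyi_quadratic_entropy(spread))
```

    Points that pile up in one region of the map give a lower entropy than points spread across it, which is the intuition behind using density in the CGR/USM plane as a signal of over-represented motifs.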

    Accessing complexity from genome information

    This paper studies the information content of the chromosomes of 24 species. In a first phase, a scheme inspired by dynamical system state space representation is developed. For each chromosome, the state space dynamical evolution is plotted on a two-dimensional chart. The plots are then analyzed and characterized from the perspective of fractal dimension. This information is integrated into two measures of the species' complexity, addressing its average and variability. The results are in close accordance with phylogenetics, pointing to quantitative aspects of the species' genomic complexity.