25 research outputs found

    10231 Abstracts Collection -- Structure Discovery in Biology: Motifs, Networks & Phylogenies

    Get PDF
    From 06.06. to 11.06.2010, the Dagstuhl Seminar 10231 ``Structure Discovery in Biology: Motifs, Networks & Phylogenies \u27\u27 was held in Schloss Dagstuhl~--~Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

    Phylogenetic Detection of Recombination with a Bayesian Prior on the Distance between Trees

    Get PDF
    Genomic regions participating in recombination events may support distinct topologies, and phylogenetic analyses should incorporate this heterogeneity. Existing phylogenetic methods for recombination detection are challenged by the enormous number of possible topologies, even for a moderate number of taxa. If, however, the detection analysis is conducted independently between each putative recombinant sequence and a set of reference parentals, potential recombinations between the recombinants are neglected. In this context, a recombination hotspot can be inferred in phylogenetic analyses if we observe several consecutive breakpoints. We developed a distance measure between unrooted topologies that closely resembles the number of recombinations. By introducing a prior distribution on these recombination distances, a Bayesian hierarchical model was devised to detect phylogenetic inconsistencies occurring due to recombinations. This model relaxes the assumption of known parental sequences, still common in HIV analysis, allowing the entire dataset to be analyzed at once. On simulated datasets with up to 16 taxa, our method correctly detected recombination breakpoints and the number of recombination events for each breakpoint. The procedure is robust to rate and transition∶transversion heterogeneities for simulations with and without recombination. This recombination distance is related to recombination hotspots. Applying this procedure to a genomic HIV-1 dataset, we found evidence for hotspots and de novo recombination

    The era of the ARG: an empiricist's guide to ancestral recombination graphs

    Full text link
    In the presence of recombination, the evolutionary relationships between a set of sampled genomes cannot be described by a single genealogical tree. Instead, the genomes are related by a complex, interwoven collection of genealogies formalized in a structure called an ancestral recombination graph (ARG). An ARG extensively encodes the ancestry of the genome(s) and thus is replete with valuable information for addressing diverse questions in evolutionary biology. Despite its potential utility, technological and methodological limitations, along with a lack of approachable literature, have severely restricted awareness and application of ARGs in empirical evolution research. Excitingly, recent progress in ARG reconstruction and simulation have made ARG-based approaches feasible for many questions and systems. In this review, we provide an accessible introduction and exploration of ARGs, survey recent methodological breakthroughs, and describe the potential for ARGs to further existing goals and open avenues of inquiry that were previously inaccessible in evolutionary genomics. Through this discussion, we aim to more widely disseminate the promise of ARGs in evolutionary genomics and encourage the broader development and adoption of ARG-based inference.Comment: 34 pages, 3 figures, 3 table

    Genome reconstruction and combinatoric analyses of rearrangement evolution

    Get PDF
    Cancer is often associated with a high number of large-scale, structural rearrangements. In a highly selective environment, some `driver' mutations conferring clonal growth advantage will be positively selected, accounting for further cancer development. Clarifying their nature, as well as their contribution to the pathology is a major current focus of biomedical research. Next generation sequencing technologies can be used nowadays to generate high-resolution data-sets of these alterations in cancer genomes. This project has been developed along two main lines: 1) the reconstruction of cancer aberrant karyotypes, together with their underlying evolutionary history; 2) the elucidation of some combinatorial properties associated with gene duplications. We applied graph theory to the problem of reconstructing the final cancer genome sequence; additionally, we developed an algorithmic approach for the reconstruction of a multi-step evolution consistent with read coverage and paired end data, giving insights on the possible molecular mechanisms underlying rearrangements. Looking at the combinatorics of both tandem and inverted duplication, we developed an algebraic formalism for the representation of these processes. This allowed us to both explore the geometric properties of sequences arising by Tandem Duplication (TD), and obtain a recursion for the number of tandem duplications evolutions after n events. Such results are missing for inverted duplications, whose combinatorial properties have been nevertheless deeply elucidated. Our results have allowed: 1) the identification, through an original approach, of potential rearrangement mechanisms associated with cancer development, and 2) the definition and mathematical description of the complete evolutionary space of specific rearrangement classes

    RNA, the Epicenter of Genetic Information

    Get PDF
    The origin story and emergence of molecular biology is muddled. The early triumphs in bacterial genetics and the complexity of animal and plant genomes complicate an intricate history. This book documents the many advances, as well as the prejudices and founder fallacies. It highlights the premature relegation of RNA to simply an intermediate between gene and protein, the underestimation of the amount of information required to program the development of multicellular organisms, and the dawning realization that RNA is the cornerstone of cell biology, development, brain function and probably evolution itself. Key personalities, their hubris as well as prescient predictions are richly illustrated with quotes, archival material, photographs, diagrams and references to bring the people, ideas and discoveries to life, from the conceptual cradles of molecular biology to the current revolution in the understanding of genetic information. Key Features Documents the confused early history of DNA, RNA and proteins - a transformative history of molecular biology like no other. Integrates the influences of biochemistry and genetics on the landscape of molecular biology. Chronicles the important discoveries, preconceptions and misconceptions that retarded or misdirected progress. Highlights major pioneers and contributors to molecular biology, with a focus on RNA and noncoding DNA. Summarizes the mounting evidence for the central roles of non-protein-coding RNA in cell and developmental biology. Provides a thought-provoking retrospective and forward-looking perspective for advanced students and professional researchers

    Differential evolution of non-coding DNA across eukaryotes and its close relationship with complex multicellularity on Earth

    Get PDF
    Here, I elaborate on the hypothesis that complex multicellularity (CM, sensu Knoll) is a major evolutionary transition (sensu Szathmary), which has convergently evolved a few times in Eukarya only: within red and brown algae, plants, animals, and fungi. Paradoxically, CM seems to correlate with the expansion of non-coding DNA (ncDNA) in the genome rather than with genome size or the total number of genes. Thus, I investigated the correlation between genome and organismal complexities across 461 eukaryotes under a phylogenetically controlled framework. To that end, I introduce the first formal definitions and criteria to distinguish ‘unicellularity’, ‘simple’ (SM) and ‘complex’ multicellularity. Rather than using the limited available estimations of unique cell types, the 461 species were classified according to our criteria by reviewing their life cycle and body plan development from literature. Then, I investigated the evolutionary association between genome size and 35 genome-wide features (introns and exons from protein-coding genes, repeats and intergenic regions) describing the coding and ncDNA complexities of the 461 genomes. To that end, I developed ‘GenomeContent’, a program that systematically retrieves massive multidimensional datasets from gene annotations and calculates over 100 genome-wide statistics. R-scripts coupled to parallel computing were created to calculate >260,000 phylogenetic controlled pairwise correlations. As previously reported, both repetitive and non-repetitive DNA are found to be scaling strongly and positively with genome size across most eukaryotic lineages. Contrasting previous studies, I demonstrate that changes in the length and repeat composition of introns are only weakly or moderately associated with changes in genome size at the global phylogenetic scale, while changes in intron abundance (within and across genes) are either not or only very weakly associated with changes in genome size. Our evolutionary correlations are robust to: different phylogenetic regression methods, uncertainties in the tree of eukaryotes, variations in genome size estimates, and randomly reduced datasets. Then, I investigated the correlation between the 35 genome-wide features and the cellular complexity of the 461 eukaryotes with phylogenetic Principal Component Analyses. Our results endorse a genetic distinction between SM and CM in Archaeplastida and Metazoa, but not so clearly in Fungi. Remarkably, complex multicellular organisms and their closest ancestral relatives are characterized by high intron-richness, regardless of genome size. Finally, I argue why and how a vast expansion of non-coding RNA (ncRNA) regulators rather than of novel protein regulators can promote the emergence of CM in Eukarya. As a proof of concept, I co-developed a novel ‘ceRNA-motif pipeline’ for the prediction of “competing endogenous” ncRNAs (ceRNAs) that regulate microRNAs in plants. We identified three candidate ceRNAs motifs: MIM166, MIM171 and MIM159/319, which were found to be conserved across land plants and be potentially involved in diverse developmental processes and stress responses. Collectively, the findings of this dissertation support our hypothesis that CM on Earth is a major evolutionary transition promoted by the expansion of two major ncDNA classes, introns and regulatory ncRNAs, which might have boosted the irreversible commitment of cell types in certain lineages by canalizing the timing and kinetics of the eukaryotic transcriptome.:Cover page Abstract Acknowledgements Index 1. The structure of this thesis 1.1. Structure of this PhD dissertation 1.2. Publications of this PhD dissertation 1.3. Computational infrastructure and resources 1.4. Disclosure of financial support and information use 1.5. Acknowledgements 1.6. Author contributions and use of impersonal and personal pronouns 2. Biological background 2.1. The complexity of the eukaryotic genome 2.2. The problem of counting and defining “genes” in eukaryotes 2.3. The “function” concept for genes and “dark matter” 2.4. Increases of organismal complexity on Earth through multicellularity 2.5. Multicellularity is a “fitness transition” in individuality 2.6. The complexity of cell differentiation in multicellularity 3. Technical background 3.1. The Phylogenetic Comparative Method (PCM) 3.2. RNA secondary structure prediction 3.3. Some standards for genome and gene annotation 4. What is in a eukaryotic genome? GenomeContent provides a good answer 4.1. Background 4.2. Motivation: an interoperable tool for data retrieval of gene annotations 4.3. Methods 4.4. Results 4.5. Discussion 5. The evolutionary correlation between genome size and ncDNA 5.1. Background 5.2. Motivation: estimating the relationship between genome size and ncDNA 5.3. Methods 5.4. Results 5.5. Discussion 6. The relationship between non-coding DNA and Complex Multicellularity 6.1. Background 6.2. Motivation: How to define and measure complex multicellularity across eukaryotes? 6.3. Methods 6.4. Results 6.5. Discussion 7. The ceRNA motif pipeline: regulation of microRNAs by target mimics 7.1. Background 7.2. A revisited protocol for the computational analysis of Target Mimics 7.3. Motivation: a novel pipeline for ceRNA motif discovery 7.4. Methods 7.5. Results 7.6. Discussion 8. Conclusions and outlook 8.1. Contributions and lessons for the bioinformatics of large-scale comparative analyses 8.2. Intron features are evolutionarily decoupled among themselves and from genome size throughout Eukarya 8.3. “Complex multicellularity” is a major evolutionary transition 8.4. Role of RNA throughout the evolution of life and complex multicellularity on Earth 9. Supplementary Data Bibliography Curriculum Scientiae Selbständigkeitserklärung (declaration of authorship

    Faculty Publications and Creative Works 2003

    Get PDF
    Faculty Publications & Creative Works is an annual compendium of scholarly and creative activities of University of New Mexico faculty during the noted calendar year. It serves to illustrate the robust and active intellectual pursuits conducted by the faculty in support of teaching and research at UNM

    Faculty Publications and Creative Works 2001

    Get PDF
    One of the ways in which we recognize our faculty at the University of New Mexico is through Faculty Publications & Creative Works. An annual publication, it highlights our faculty\u27s scholarly and creative activities and achievements and serves as a compendium of UNM faculty efforts during the 2001 calendar year. Faculty Publications & Creative Works strives to illustrate the depth and breadth of research activities performed throughout our University\u27s laboratories, studios and classrooms. We believe that the communication of individual research is a significant method of sharing concepts and thoughts and ultimately inspiring the birth of new ideas. In support of this, UNM faculty during 2001 produced over 2,299* works, including 1,685 scholarly papers and articles, 69 books, 269 book chapters, 184 reviews, 86 creative works and 6 patented works. We are proud of the accomplishments of our faculty which are in part reflected in this book, which illustrates the diversity of intellectual pursuits in support of research and education at the University of New Mexico
    corecore