279 research outputs found
LIPIcs, Volume 251, ITCS 2023, Complete Volume
PLACING THE EVOLUTIONARY HISTORY OF DESMOGNATHUS SALAMANDERS IN CONTEXT: A PHYLOGEOGRAPHIC APPROACH
Patterns of genetic variation do not arise in a vacuum but are instead shaped by the interplay between evolutionary forces and ecological constraints. Here, I use a phylogeographic approach to examine the role that ecology played in lineage divergence in the Desmognathus quadramaculatus species complex (Family: Plethodontidae), which consists of three nominal species: D. quadramaculatus, D. marmoratus, and D. folkertsi. Previous phylogenetic studies have shown that individuals from these species do not form clades based on phenotype. My approach to reconciling phylogenetic discordance was two-fold, using (1) genome-wide markers to provide insight into the relationships among lineages and (2) geographic and climate data to provide context for patterns of genetic diversity.
First, I obtained genome-wide nuclear markers using double-digest restriction-site associated DNA sequencing (ddRAD) to examine whether two morphologically divergent species, D. marmoratus and D. quadramaculatus, represent independently evolving lineages. Phylogenetic, population structure, and model testing analyses all confirmed that D. marmoratus and D. quadramaculatus do not group based on phenotype. Instead, I found that there were two cryptic genetic lineages (Nantahala and Pisgah) that each contained both phenotypes. Additionally, ecological niche modeling showed that the two genetic lineages primarily occupy geographic areas with significantly different climates, suggesting that climate may have played a role in divergence.
Next, I assembled loci from publicly available sequencing data using a draft transcriptome of Desmognathus fuscus as a reference to assess the three nominal species in the quadramaculatus species complex across their entire range. I used phylogenetic and population structure analyses, alongside haplowebs and conspecificity matrices, to determine if the loci supported the hypothesis that the phenotypes represent multiple independently evolving lineages within the broader genetic clades found in the previous chapter. I found that the loci were not informative enough to determine whether the phenotypes had a genetic basis in Pisgah, but did support genetic divergence between phenotypes in Nantahala.
Finally, I used ecological niche models (ENMs) and resistance modeling to place the genetic results and phenotypic diversity within the context of time and space. I found that though the quadramaculatus and marmoratus phenotypes were nearly indistinguishable in niche space in the present day, they were projected to occupy different geographic areas in the past and future. The southern portion of the study area had areas of high habitat suitability from the Last Glacial Maximum (~22 kya) to the present, which aligns with the higher genetic divergence between groups in Nantahala. Anthropogenic land use changes reduced habitat availability but likely did not drive genetic divergence in the past, and may be of more consequence to genetic diversity than climate change over the next 50 years.
Like many taxa that underwent adaptive radiations, the evolutionary history of Desmognathus has been obfuscated by high rates of within-species phenotypic diversity and shared morphology between distantly related lineages. My findings emphasize the importance of interrogating complex patterns of genetic variation within the context of the dynamic, heterogeneous landscapes in which they arise.
A critical analysis of the current state of virus taxonomy
Taxonomical classification has preceded evolutionary understanding. For that reason, taxonomy has become a battleground fueled by knowledge gaps, technical limitations, and a priorism. Here we assess the current state of this challenging field, focusing on fallacies that are common in viral classification. We emphasize that viruses are crucial contributors to the genomic and functional makeup of holobionts, organismal communities that behave as units of biological organization. Consequently, viruses cannot be considered taxonomic units because they challenge crucial concepts of organismality and individuality. Instead, they should be considered processes that integrate virions and their hosts into life cycles. Viruses harbor phylogenetic signatures of genetic transfer that compromise monophyly and the validity of deep taxonomic ranks. A focus on building phylogenetic networks using alignment-free methodologies and molecular structure can help mitigate the impasse, at least in part. Finally, structural phylogenomic analysis challenges the polyphyletic scenario of multiple viral origins adopted by virus taxonomy and instead supports an ancient cellular origin of viruses. We therefore prompt abandoning deep ranks and urgently reevaluating the validity of taxonomic units and principles of virus classification.
Differential evolution of non-coding DNA across eukaryotes and its close relationship with complex multicellularity on Earth
Here, I elaborate on the hypothesis that complex multicellularity (CM, sensu Knoll) is a major evolutionary transition (sensu Szathmary), which has convergently evolved a few times in Eukarya only: within red and brown algae, plants, animals, and fungi. Paradoxically, CM seems to correlate with the expansion of non-coding DNA (ncDNA) in the genome rather than with genome size or the total number of genes. Thus, I investigated the correlation between genome and organismal complexities across 461 eukaryotes under a phylogenetically controlled framework. To that end, I introduce the first formal definitions and criteria to distinguish "unicellularity", "simple" (SM) and "complex" multicellularity. Rather than using the limited available estimations of unique cell types, the 461 species were classified according to our criteria by reviewing their life cycle and body plan development from literature. Then, I investigated the evolutionary association between genome size and 35 genome-wide features (introns and exons from protein-coding genes, repeats and intergenic regions) describing the coding and ncDNA complexities of the 461 genomes. To that end, I developed "GenomeContent", a program that systematically retrieves massive multidimensional datasets from gene annotations and calculates over 100 genome-wide statistics. R-scripts coupled to parallel computing were created to calculate >260,000 phylogenetically controlled pairwise correlations. As previously reported, both repetitive and non-repetitive DNA are found to be scaling strongly and positively with genome size across most eukaryotic lineages. In contrast to previous studies, I demonstrate that changes in the length and repeat composition of introns are only weakly or moderately associated with changes in genome size at the global phylogenetic scale, while changes in intron abundance (within and across genes) are either not or only very weakly associated with changes in genome size.
Our evolutionary correlations are robust to different phylogenetic regression methods, uncertainties in the tree of eukaryotes, variations in genome size estimates, and randomly reduced datasets. Then, I investigated the correlation between the 35 genome-wide features and the cellular complexity of the 461 eukaryotes with phylogenetic Principal Component Analyses. Our results endorse a genetic distinction between SM and CM in Archaeplastida and Metazoa, but not so clearly in Fungi. Remarkably, complex multicellular organisms and their closest ancestral relatives are characterized by high intron-richness, regardless of genome size. Finally, I argue why and how a vast expansion of non-coding RNA (ncRNA) regulators rather than of novel protein regulators can promote the emergence of CM in Eukarya. As a proof of concept, I co-developed a novel "ceRNA-motif pipeline" for the prediction of "competing endogenous" ncRNAs (ceRNAs) that regulate microRNAs in plants. We identified three candidate ceRNA motifs: MIM166, MIM171 and MIM159/319, which were found to be conserved across land plants and potentially involved in diverse developmental processes and stress responses. Collectively, the findings of this dissertation support our hypothesis that CM on Earth is a major evolutionary transition promoted by the expansion of two major ncDNA classes, introns and regulatory ncRNAs, which might have boosted the irreversible commitment of cell types in certain lineages by canalizing the timing and kinetics of the eukaryotic transcriptome.
Cover page
Abstract
Acknowledgements
Index
1. The structure of this thesis
1.1. Structure of this PhD dissertation
1.2. Publications of this PhD dissertation
1.3. Computational infrastructure and resources
1.4. Disclosure of financial support and information use
1.5. Acknowledgements
1.6. Author contributions and use of impersonal and personal pronouns
2. Biological background
2.1. The complexity of the eukaryotic genome
2.2. The problem of counting and defining "genes" in eukaryotes
2.3. The "function" concept for genes and "dark matter"
2.4. Increases of organismal complexity on Earth through multicellularity
2.5. Multicellularity is a "fitness transition" in individuality
2.6. The complexity of cell differentiation in multicellularity
3. Technical background
3.1. The Phylogenetic Comparative Method (PCM)
3.2. RNA secondary structure prediction
3.3. Some standards for genome and gene annotation
4. What is in a eukaryotic genome? GenomeContent provides a good answer
4.1. Background
4.2. Motivation: an interoperable tool for data retrieval of gene annotations
4.3. Methods
4.4. Results
4.5. Discussion
5. The evolutionary correlation between genome size and ncDNA
5.1. Background
5.2. Motivation: estimating the relationship between genome size and ncDNA
5.3. Methods
5.4. Results
5.5. Discussion
6. The relationship between non-coding DNA and Complex Multicellularity
6.1. Background
6.2. Motivation: How to define and measure complex multicellularity across eukaryotes?
6.3. Methods
6.4. Results
6.5. Discussion
7. The ceRNA motif pipeline: regulation of microRNAs by target mimics
7.1. Background
7.2. A revisited protocol for the computational analysis of Target Mimics
7.3. Motivation: a novel pipeline for ceRNA motif discovery
7.4. Methods
7.5. Results
7.6. Discussion
8. Conclusions and outlook
8.1. Contributions and lessons for the bioinformatics of large-scale comparative analyses
8.2. Intron features are evolutionarily decoupled among themselves and from genome size throughout Eukarya
8.3. "Complex multicellularity" is a major evolutionary transition
8.4. Role of RNA throughout the evolution of life and complex multicellularity on Earth
9. Supplementary Data
Bibliography
Curriculum Scientiae
Selbständigkeitserklärung (declaration of authorship)
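GenomeContent itself is not reproduced in this listing. As a rough illustration of the kind of per-gene intron statistics the abstract above describes (intron abundance and intron length), the following Python sketch derives introns from the exon coordinates of a single transcript. The function name `intron_stats`, the returned keys, and the 1-based inclusive coordinate convention are assumptions made for this example, not part of the dissertation's program.

```python
def intron_stats(exons):
    """Summarize the introns implied by one transcript's exons.

    `exons` is a list of (start, end) pairs in 1-based inclusive
    coordinates (an assumed convention for this sketch).
    """
    ordered = sorted(exons)
    intron_lengths = []
    # An intron is the gap between the end of one exon and the start of the next.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        gap = next_start - prev_end - 1
        if gap > 0:  # skip adjacent or overlapping exons
            intron_lengths.append(gap)
    total = sum(intron_lengths)
    mean = total / len(intron_lengths) if intron_lengths else 0.0
    return {
        "intron_count": len(intron_lengths),
        "total_intron_length": total,
        "mean_intron_length": mean,
    }


# Toy transcript with three exons and therefore two introns (100 bp and 200 bp).
print(intron_stats([(1, 100), (201, 300), (501, 600)]))
```

Statistics of this kind, computed per genome, would then be the inputs to phylogenetically controlled correlations with genome size, which this sketch does not attempt.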
Comparing genomic variant identification protocols for Candida auris.
Genomic analyses are widely applied to epidemiological, population genetic and experimental studies of pathogenic fungi. A wide range of methods are employed to carry out these analyses, typically without including controls that gauge the accuracy of variant prediction. The importance of tracking outbreaks at a global scale has raised the urgency of establishing high-accuracy pipelines that generate consistent results between research groups. To evaluate currently employed methods for whole-genome variant detection and elaborate best practices for fungal pathogens, we compared how 14 independent variant calling pipelines performed across 35 Candida auris isolates from 4 distinct clades and evaluated the performance of variant calling, single-nucleotide polymorphism (SNP) counts and phylogenetic inference results. Although these pipelines used different variant callers and filtering criteria, we found high overall agreement of SNPs from each pipeline. This concordance correlated with site quality, as SNPs discovered by a few pipelines tended to show lower mapping quality scores and depth of coverage than those recovered by all pipelines. We observed that the major differences between pipelines were due to variation in read trimming strategies, SNP calling methods and parameters, and downstream filtration criteria. We calculated specificity and sensitivity for each pipeline by aligning three isolates with chromosomal level assemblies and found that the GATK-based pipelines were well balanced between these metrics. Selection of trimming methods had a greater impact on SAMtools-based pipelines than those using GATK. Phylogenetic trees inferred by each pipeline showed high consistency at the clade level, but there was more variability between isolates from a single outbreak, with pipelines that used more stringent cutoffs having lower resolution. This project generated two truth datasets useful for routine benchmarking of C. auris variant calling: a consensus VCF of genotypes discovered by 10 or more pipelines across these 35 diverse isolates, and variants for 2 samples identified from whole-genome alignments. This study provides a foundation for evaluating SNP calling pipelines and developing best practices for future fungal genomic studies.
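The two ideas in the abstract above, a consensus call set supported by a minimum number of pipelines and per-pipeline sensitivity and specificity against a truth set, can be sketched in Python as follows. This is a minimal illustration and not the study's code: the function names, the representation of a SNP as a `(contig, position)` tuple, and the `n_callable_sites` parameter (the number of positions at which a call could have been made, needed to count true negatives) are all assumptions for this example.

```python
from collections import Counter


def consensus_sites(calls_by_pipeline, min_support):
    """Return the sites reported by at least `min_support` pipelines.

    `calls_by_pipeline` maps a pipeline name to an iterable of
    (contig, position) tuples (an assumed representation).
    """
    support = Counter()
    for sites in calls_by_pipeline.values():
        support.update(set(sites))  # count each pipeline at most once per site
    return {site for site, n in support.items() if n >= min_support}


def sensitivity_specificity(called, truth, n_callable_sites):
    """Compare one pipeline's calls with a truth set.

    sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    tn = n_callable_sites - tp - fp - fn
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data: three hypothetical pipelines on a 100-site region.
calls = {
    "pipeline_a": {("chr1", 10), ("chr1", 20)},
    "pipeline_b": {("chr1", 10), ("chr1", 30)},
    "pipeline_c": {("chr1", 10), ("chr1", 20)},
}
truth = consensus_sites(calls, min_support=2)
print(sorted(truth))  # [('chr1', 10), ('chr1', 20)]
print(sensitivity_specificity(calls["pipeline_b"], truth, n_callable_sites=100))
```

In the study the threshold was 10 of 14 pipelines; the toy example uses 2 of 3 only to keep the data small. Scoring a pipeline against a consensus it contributed to is done here purely for brevity, and an independent truth set (such as the whole-genome alignments the abstract mentions) avoids that circularity.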
Advances in Forensic Genetics
The book has 25 articles about the status and new directions in forensic genetics. Approximately half of the articles are invited reviews, and the remaining articles deal with new forensic genetic methods. The articles cover aspects such as sampling DNA evidence at the scene of a crime; DNA transfer when handling evidence material and how to avoid DNA contamination of items, laboratory, etc.; identification of body fluids and tissues with RNA; forensic microbiome analysis with molecular biology methods as a supplement to the examination of human DNA; forensic DNA phenotyping for predicting visible traits such as eye, hair, and skin colour; new ancestry informative DNA markers for estimating ethnic origin; new genetic genealogy methods for identifying distant relatives that cannot be identified with conventional forensic DNA typing; sensitive DNA methods, including single-cell DNA analysis and other highly specialised and sensitive methods to examine ancient DNA from unidentified victims of war; forensic animal genetics; genetics of visible traits in dogs; statistical tools for interpreting forensic DNA analyses, including the most used IT tools for forensic STR-typing and DNA sequencing; haploid markers (Y-chromosome and mitochondrial DNA); inference of ethnic origin; a comprehensive logical framework for the interpretation of forensic genetic DNA data; and an overview of the ethical aspects of modern forensic genetics.
Wheat Improvement
This open-access textbook provides a comprehensive, up-to-date guide for students and practitioners wishing to access in a single volume the key disciplines and principles of wheat breeding. Wheat is a cornerstone of food security: it is the most widely grown of any crop and provides 20% of all human calories and protein. The authorship of this book includes world-class researchers and breeders whose expertise spans cutting-edge academic science all the way to impacts in farmers' fields. The book's themes and authors were selected to provide a didactic work that considers the background to wheat improvement, current mainstream breeding approaches, and translational research and avant-garde technologies that enable new breakthroughs in science to impact productivity. While the volume provides an overview for professionals interested in wheat, many of the ideas and methods presented are equally relevant to small grain cereals and crop improvement in general. The book is affordable, and because it is open access, can be readily shared and translated (in whole or in part) to university classes, members of breeding teams (from directors to technicians), conference participants, extension agents and farmers. Given the challenges currently faced by academia, industry and national wheat programs to produce higher crop yields, often with fewer inputs and under increasingly harsh climates, this volume is a timely addition to their toolkit.
The Evolutionary Genetics of Venoms: How Nature Created the Perfect Chemical Weapon
Venomous animals have fascinated humans for millennia. How nature shaped a simple biological secretion into a potent chemical weapon is a testament to evolution's power and versatility. However, the early origins and genetic mechanisms of venom evolution are not clearly understood. Venoms consist of proteinaceous cocktails where each protein can be mapped to a specific gene; I utilized this genetic tractability to uncover the molecular and genetic mechanisms behind their evolution. Using a combination of quantitative genetics, transcriptomics, and phylogenetics, I have identified specific mechanisms that led to the origin of oral venoms in mammals and reptiles. Oral venoms originated from an ancient conserved gene regulatory network whose primary role was maintaining cellular homeostasis during increased protein production. This ancient system could tolerate high protein loads, facilitating the parallel recruitment of various diverse protein families into the ancient venom. Venom complexity then increased by sequence and copy number variation of toxins. High copy numbers contributed to this system's phenotypic flexibility, allowing it to further diversify through changes in evolutionary rates and by altering the combinations of toxins used. These features enabled evolution to refine venom cocktails to form optimal formulations. I provide the first unified and deep evolutionary model describing the early steps in forming a venom system and show how millions of years of evolution produced venom phenotypes in extant lineages. All chapters of this thesis have been peer-reviewed and published.
Okinawa Institute of Science and Technology Graduate University