130 research outputs found

    Occurrence, mating strategy, and pathogenicity of members of Nectriaceae in Central Appalachia

    Members of the Nectriaceae occupy many ecological niches, including that of dominant canker pathogens such as Neonectria ditissima and N. faginata. These two pathogens contribute to the decline of American beech (Fagus grandifolia) forests across the Appalachian Mountains due to beech bark disease (BBD). Interestingly, N. ditissima is a well-known canker pathogen of many hardwood species, while N. faginata has not been observed outside of BBD. Additionally, N. faginata occurs at higher incidences than N. ditissima in BBD stands. Nectriaceae in Central Appalachia were surveyed to further characterize their diversity and possibly identify a non-beech host of N. faginata. This resulted in the recovery of ten nectriaceous species from twelve tree species. Neonectria faginata only occurred on BBD trees. Neonectria ditissima was recovered from eight tree species including Acer spicatum, Ilex mucronata, and Sorbus americana. Fusarium babinda was often recovered from BBD trees, but its role in BBD remains unclear. Corinectria gaudineerii sp. nov. was recovered from Picea rubens and Neonectria magnoliae comb. nov. from cankered Liriodendron tulipifera and Magnolia fraseri. The pathogenicity of N. magnoliae was confirmed, but the pathogenicity of C. gaudineerii was less apparent. Heterothallism for N. ditissima, N. faginata, and a number of other Nectriaceae was confirmed using molecular data and in vitro assays. This was important because different mating strategies might explain differences in the ecology of N. faginata and N. ditissima. Together, these results demonstrate the diversity of Nectriaceae in eastern North America and their mating strategies, furthering our understanding of dominant diseases affecting Appalachian forests.

    Connecting GOMMA with STROMA: an approach for semantic ontology mapping in the biomedical domain

    This thesis establishes a connection between GOMMA and STROMA, two tools for ontology processing. Consequently, a new workflow for denoting a set of correspondences with five semantic relation types has been implemented. Such a rich denotation is scarcely discussed in the literature. The evaluation of the denotation shows that trivial correspondences are easy to recognize (tF > 90). The challenge is the denotation of non-trivial types (30 < ntF < 70). A prerequisite of the implemented workflow is the extraction of semantic relations between concepts. These relations represent additional background knowledge for the enrichment tool STROMA and are integrated into the repository SemRep, which is accessed by this tool. Thus, STROMA is able to calculate a semantic type more precisely. UMLS was chosen as a biomedical knowledge source because it subsumes many different ontologies of this domain and thus represents a rich resource. Nevertheless, only a small set of relations met the requirements that are imposed on SemRep relations. Further studies may analyze whether there is an appropriate way to integrate the missing relations as well. The connection of GOMMA with STROMA allows the semantic enrichment of a biomedical mapping. As a consequence, this thesis sheds light on two subjects of research. First, STROMA had previously been tested with general ontologies, which model common-sense knowledge. Within this thesis, STROMA was applied to domain ontologies. Studies have shown that, overall, STROMA was able to treat such ontologies as well. However, some strategies for the enrichment process are based on assumptions that are misleading in the biomedical domain. Consequently, further strategies are suggested in this thesis which might improve the type denotation. These strategies may lead to an optimization of STROMA for biomedical data sets. A more thorough analysis will review their scope, also beyond the biomedical domain. 
Second, the established connection may lead to deeper investigations into the advantages of semantic enrichment in the biomedical domain, as an enriched mapping is returned. Despite the heterogeneity of source and target ontologies, such a mapping results in improved interoperability at a finer level of granularity. The utilization of semantically rich correspondences in the biomedical domain is a worthwhile focus for future research.

    Identification and analysis of noncoding genetic elements in plant genomes

    The goal of this dissertation is to identify and analyze two kinds of noncoding genetic elements, microRNAs (miRNAs) and introns, in plant genomes. miRNAs are a class of short noncoding RNAs, some of which have been shown to regulate gene expression at the post-transcriptional level by complementary base pairing to their target mRNAs. In the early 2000s, a large number of miRNAs were cloned in animals, plants and viruses. Complementary efforts also sought to identify miRNA genes computationally. In the dissertation, I developed a computational method to identify miRNA genes and their target mRNAs in Arabidopsis. Experiments were then performed to validate some of the new miRNA genes and the miRNA-target interactions. The study facilitates the identification and characterization of conserved and non-conserved miRNAs in plants. Another noncoding element discussed in the dissertation is the intron. The removal of introns from precursor mRNAs (pre-mRNAs), which is called pre-mRNA splicing, is essential to produce mature mRNAs and proteins. Alternative splicing (AS) occurs when different patterns of splicing result from the same pre-mRNA. AS is important in regulated gene expression and has various effects on mRNAs and proteins. In the dissertation, I study the role of splice site sequences in intron evolution and AS. In one chapter, software is developed to identify conserved intron positions within orthologous genes. I demonstrate its application to a set of plant-specific orthologous genes. In another chapter, I develop a computational approach to identify transcript-confirmed introns and genes in 15 plant species and analyze intron evolution in the context of orthology. The results indicate dynamic evolution of introns with different splice sites and the significance of splice site sequences during intron evolution. 
In a third chapter, using the transcript-confirmed data, I identify AS introns and events in the 15 plant species and study their behavior and effect on protein sequences. The findings underscore the important role of splice site sequences in AS regulation. In conclusion, the identification of transcript-confirmed introns and the study of intron evolution and AS provide insight into the significant role of splice site sequences in intron evolution and AS.

    Computational methods for rapid structural modelling of antibody-antigen interactions to improve identification of antigen-specific antibodies from BCR-seq data

    Antibodies are immune proteins that are the basis of humoral immunity in jawed vertebrates, permitting highly-specific, mutable and lasting recognition of diverse foreign molecules. In their membrane-bound form, they are referred to as B cell receptors (BCRs): the collection of antibodies or BCRs in an individual is a record of their antigenic history, and tells us how their B cells have recognised and interacted with pathogens over their lifespan. Since the advent of next-generation sequencing and its application to these repertoires, we can sample and sequence the receptors of 10^3 to 10^6 B cells from a given individual. However, there remains an unmet need for mapping the mounting number of these BCR sequences to their putative antigen specificity. In this thesis, we aim to demonstrate how computational structural methods can be applied to BCR repertoires to improve our ability to identify antigen-specific antibodies. In the first chapter, we describe a novel method for identifying antigen-specific antibodies from repertoire data using paratope prediction, which we call "paratyping", and then use paratyping to discover novel Pertussis toxoid-binding antibodies from the BCR repertoires of transgenic mice. Transgenic mice are a common source of therapeutic antibodies. To improve our understanding of the structural landscape of the BCR repertoires of these workhorses of antibody discovery, in the second chapter we apply structural annotation and modelling approaches to naive repertoires from humans, mice and transgenic mice. We show that the starting structural repertoires of transgenic mice are intermediate between humans and mice, despite being encoded by human genes, as a result of deficient junctional diversification in the primary repertoire of transgenic mice. In the third chapter, we describe the creation of a novel sequence database for Ebolavirus-binding antibodies called EBOV-AbDab. 
Using this data alongside other techniques, we analysed the longitudinal BCR repertoires of 40 individuals vaccinated with Ad26.ZEBOV/MVA-BN-Filo. We used paratyping to predict the epitopes of the majority of the most expanded clonotypes after the booster vaccine, providing evidence of a highly convergent response to Ebolavirus vaccination that correlated with anti-Ebolavirus glycoprotein IgG titre. Finally, in the fourth chapter we examine the growing body of natively paired data to try to understand VH:VL pairing preferences. We re-examine the VH:VL interface in the context of germline gene usage and subsequently analyse at scale the genetic pairing preference of antibodies in the largest paired sequencing repository, the Observed Antibody Space, in combination with 3x as much novel data produced by an alternative single-cell sequencing technology. In the final chapter, I outline further work which can extend or complete the work presented in the previous chapters, as well as general outlooks for the field.

    Machine learning methods for MicroRNA target prediction

    MicroRNAs are small non-coding RNA molecules that form a post-transcriptional layer of gene regulation. A microRNA binds to messenger RNA in order to repress translation and accelerate its degradation, ultimately downregulating the expression of genes. The mechanics of these bindings in animals are complex and depend on a myriad of contextual factors which influence the specificity and efficacy of potential interactions. This thesis describes the development of miRsight, a novel target prediction tool utilising advanced machine learning techniques. miRsight is trained using 44 target recognition features compiled through testing on published microRNA-transfected RNA sequencing data, an experimental procedure in which microRNA molecules are introduced into a sample to quantify their impact on gene expression. In addition to the tool itself, a database of pre-computed predictions is hosted at https://mirsight.info, which also provides search, filter, and export functionality for user convenience. The results of this study indicate that miRsight is able to predict and rank microRNA targets more effectively than popular target prediction tools. This is validated by examining the downregulation of gene expression from predicted targets using microRNA transfection. In the 12 samples reserved for testing, miRsight is shown to more consistently identify true targets in the top 100, 300 and 500 of predictions by rank compared to TargetScan, MirTarget and DIANA-microT. Additionally, miRsight is capable of producing several thousand total predictions for each microRNA while maintaining this high rate of prediction accuracy.
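The "top 100, 300 and 500 of predictions by rank" comparison can be read as a precision-at-k style measurement. The sketch below illustrates that general idea only; the function name `precision_at_k`, the gene identifiers, and the definition of a "true target" as membership in a given set are assumptions made for illustration, not the evaluation procedure used in the thesis.

```python
def precision_at_k(ranked_genes, true_targets, k):
    """Fraction of the top-k ranked predictions that are true targets.

    ranked_genes: predicted targets, best-scoring first (hypothetical input).
    true_targets: set of genes treated as true targets, for example genes
                  observed to be downregulated after transfection (assumption).
    """
    top = ranked_genes[:k]
    if not top:
        return 0.0
    hits = sum(1 for gene in top if gene in true_targets)
    return hits / len(top)


# Toy data: four ranked predictions, two of which are true targets.
ranked = ["g1", "g2", "g3", "g4"]
truth = {"g1", "g3"}
for k in (2, 4):
    print(k, precision_at_k(ranked, truth, k))
```

Computing this value at several cutoffs for each tool, on the same held-out samples, gives a simple way to compare how well different rankings concentrate true targets near the top.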

    NOVEL COMPUTATIONAL METHODS FOR SEQUENCING DATA ANALYSIS: MAPPING, QUERY, AND CLASSIFICATION

    Over the past decade, the evolution of next-generation sequencing technology has considerably advanced genomics research. As a consequence, fast and accurate computational methods are needed for analyzing large data sets in different applications. The research presented in this dissertation focuses on three areas: RNA-seq read mapping, large-scale data query, and metagenomics sequence classification. A critical step of RNA-seq data analysis is to map the RNA-seq reads onto a reference genome. This dissertation presents a novel splice alignment tool, MapSplice3. It achieves high read alignment and base mapping yields and is able to detect splice junctions, gene fusions, and circular RNAs comprehensively at the same time. Based on MapSplice3, we further develop a novel lightweight approach called iMapSplice that enables personalized mRNA transcriptional profiling. As a huge amount of RNA-seq data has been shared through public datasets, it provides invaluable resources for researchers to test hypotheses by reusing existing datasets. To meet the need for efficiently querying large-scale sequencing data, a novel method called SeqOthello has been developed. It is able to efficiently query sequence k-mers against large-scale datasets and determine the existence of the given sequence. Metagenomics studies often generate tens of millions of reads to capture the presence of microbial organisms. Thus, efficient and accurate algorithms are in high demand. In this dissertation, we introduce MetaOthello, a probabilistic hashing classifier for metagenomic sequences. It supports efficient query of a taxon using its k-mer signatures.
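The idea of querying a sequence's k-mers against a collection of datasets can be illustrated with a very small sketch. This uses a plain Python dictionary as a stand-in index and is not the Othello hashing structure behind SeqOthello or MetaOthello; the names `kmers`, `build_index`, `query`, and the `min_fraction` threshold are all hypothetical choices made for this example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(samples, k):
    """Map each k-mer to the set of sample names whose reads contain it."""
    index = {}
    for name, reads in samples.items():
        for read in reads:
            for kmer in kmers(read, k):
                index.setdefault(kmer, set()).add(name)
    return index


def query(index, seq, k, min_fraction=0.8):
    """Return samples containing at least min_fraction of the query's k-mers."""
    query_kmers = kmers(seq, k)
    if not query_kmers:
        return {}
    counts = {}
    for kmer in query_kmers:
        for name in index.get(kmer, ()):
            counts[name] = counts.get(name, 0) + 1
    total = len(query_kmers)
    return {name: n / total for name, n in counts.items() if n / total >= min_fraction}


# Toy data: two "samples", each with a single read.
samples = {"A": ["ACGTACGT"], "B": ["TTTTGGGG"]}
index = build_index(samples, k=4)
print(query(index, "ACGTAC", k=4))
```

A dictionary of sets grows quickly with real sequencing data, which is the kind of scaling problem that compact hashing-based structures are designed to address.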

    Doctor of Philosophy

    Synthetic biology is a new field in which engineers, biologists, and chemists are working together to transform genetic engineering into an advanced engineering discipline, one in which the design and construction of novel genetic circuits are made possible through the application of engineering principles. This dissertation explores two engineering strategies to address the challenges of working with genetic technology, namely the development of standards for describing genetic components and circuits at separate yet connected levels of detail and the use of Genetic Design Automation (GDA) software tools to simplify and speed up the process of optimally designing genetic circuits. Its contributions to the field of synthetic biology include (1) a proposal for the next version of the Synthetic Biology Open Language (SBOL), an existing standard for specifying and exchanging genetic designs electronically, and (2) a GDA workflow that enables users of the software tool iBioSim to create an abstract functional specification, automatically select genetic components that satisfy the specification from a design library, and compose the selected components into a standardized genetic circuit design for subsequent analysis and physical construction. Ultimately, this dissertation demonstrates how existing techniques and concepts from electrical and computer engineering can be adapted to overcome the challenges of genetic design and is an example of what is possible when working with publicly available standards for genetic design.

    A blood atlas of COVID-19 defines hallmarks of disease severity and specificity.

    Treatment of severe COVID-19 is currently limited by clinical heterogeneity and incomplete description of specific immune biomarkers. We present here a comprehensive multi-omic blood atlas for patients with varying COVID-19 severity in an integrated comparison with influenza and sepsis patients versus healthy volunteers. We identify immune signatures and correlates of host response. Hallmarks of disease severity involved cells, their inflammatory mediators and networks, including progenitor cells and specific myeloid and lymphocyte subsets, features of the immune repertoire, acute phase response, metabolism, and coagulation. Persisting immune activation involving AP-1/p38MAPK was a specific feature of COVID-19. The plasma proteome enabled sub-phenotyping into patient clusters, predictive of severity and outcome. Systems-based integrative analyses including tensor and matrix decomposition of all modalities revealed feature groupings linked with severity and specificity compared to influenza and sepsis. Our approach and blood atlas will support future drug development, clinical trial design, and personalized medicine approaches for COVID-19.