PlutoF cloud - a biodiversity database and analysis platform for biologists
The electronic version of the dissertation does not include the publications. The aim of the doctoral thesis was to create a platform for entering, storing, editing and analyzing biodiversity data and the associated metadata over the web. A narrower aim was to create specific solutions for analyzing fungal molecular data (DNA sequences) and for DNA-based identification of fungi.
Many fungal species do not form the fruit bodies used for identification and live only as mycelium in soil. Morphological and anatomical characters are therefore not sufficient for distinguishing and identifying them. For the last fifteen years, nucleotide sequences of a specific DNA region have been used to identify fungal species from environmental samples. To identify such DNA sequences, they must be compared with sequences of known species. The latter usually originate from a fungal fruit body or living culture identified by an expert. The bottleneck of such DNA-based identification has so far been the lack of a database of high-quality, identified DNA sequences against which new sequences can be compared.
In this work, UNITE (http://unite.ut.ee), a database of high-quality DNA sequences from fungal fruit bodies, was created together with its web interface and analysis software for fast and reliable DNA-based identification of fungi. The database contains DNA sequences from fungal fruit bodies collected around the world by experts in the field. In addition, analyses can include all DNA sequences from environmental samples deposited in the international gene banks, whose metadata are checked and supplemented in our system by an international research group. Several software solutions for checking the quality of these DNA sequences were created as part of the thesis, along with the web workbench PlutoF (http://plutof.ut.ee) for managing and analyzing the whole dataset. The UNITE database is also hosted in the PlutoF cloud.
In addition to the example described above, the PlutoF platform makes it possible to create and manage a variety of biodiversity databases, from taxonomy to environmental genomics. The PlutoF cloud has been used successfully to create the databases Estonian eBiodiversity and Estonian Species Registry (http://elurikkus.ut.ee) and the Estonian national databases of natural history collections (http://elurikkus.ut.ee/collections.php?lang=est). Various research groups, including international ones, use the PlutoF cloud for managing ecological studies and analyzing molecular data.
The aim of this thesis was to develop technologies for managing, editing and analyzing biodiversity data. The main focus was on developing web-based tools for molecular identification of fungi.
Many fungi do not form the fruit-bodies needed for morphological identification; instead they live in soil as fungal hyphae. For this reason, the identification of fungal species in ecological studies is nowadays most commonly DNA-based: DNA sequences from unknown species are compared against DNA sequences from known species, which are often identified by an expert based on a fungal fruit-body or living culture. DNA-based identification therefore requires a reference database of correctly identified, high-quality DNA sequences for comparison.
As a result of this thesis, UNITE (http://unite.ut.ee) was developed: a database storing high-quality fungal DNA sequences generated from fruiting bodies collected and identified by experts, together with a public web page and analysis tools for fast and reliable identification of fungi. In addition to the reference dataset, there is an option to include all fungal DNA sequences deposited in international gene repositories in the identification process. These DNA sequences are stored locally in our system, where their quality is checked, identifications are verified and metadata are complemented. Several software tools for checking sequence quality and a web-based workbench, PlutoF (http://plutof.ut.ee), for managing and analyzing these data were also developed.
The PlutoF cloud makes it possible to create and manage various biodiversity databases, from taxonomy to environmental genomics. It has been used successfully to create the Estonian eBiodiversity database and the Estonian Species Registry (http://elurikkus.ut.ee), and the Estonian national databases of natural history collections (http://elurikkus.ut.ee/collections.php?lang=est). Several international workgroups use the PlutoF cloud for managing ecological studies and analyzing molecular data.
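The comparison step described in this abstract, matching an unknown sequence against expert-identified reference sequences, can be illustrated with a toy sketch. It is not the UNITE or PlutoF implementation: it assumes a simple k-mer Jaccard similarity in place of the alignment-based searches a real pipeline would use, and the function names, reference names, sequences and the `min_similarity` cut-off are all hypothetical.

```python
def kmers(seq, k=4):
    """Return the set of overlapping k-mers in a DNA sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def identify(query, references, k=4, min_similarity=0.5):
    """Return (best_reference_name, score), or (None, score) if no
    reference is similar enough.

    `references` maps a name to an expert-identified sequence.
    The threshold is an illustrative assumption, not a published value.
    """
    query_kmers = kmers(query, k)
    best_name, best_score = None, 0.0
    for name, ref_seq in references.items():
        score = jaccard(query_kmers, kmers(ref_seq, k))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < min_similarity:
        return None, best_score
    return best_name, best_score


# Hypothetical reference set; the sequences are made up for illustration.
REFERENCES = {
    "Reference species A": "ACGTACGTTAGCTAGGCTA",
    "Reference species B": "TTTTGGGGCCCCAAAATTTT",
}

match_name, match_score = identify("ACGTACGTTAGCTAGGCTA", REFERENCES)
no_match_name, no_match_score = identify("GGGGGGGGGGGG", REFERENCES)
```

Returning `None` below the threshold reflects the point made in the abstract: a query can only be named reliably when a sufficiently close, correctly identified reference exists.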
ToxGen: An improved reference database for the identification of type B-trichothecene genotypes in Fusarium
Type B trichothecenes, which pose a serious hazard to consumer health, occur worldwide in grains. These mycotoxins are produced mainly by three different trichothecene genotypes/chemotypes: 3ADON (3-acetyldeoxynivalenol), 15ADON (15-acetyldeoxynivalenol) and NIV (nivalenol), named after these three major mycotoxin compounds. Correct identification of these genotypes is essential for all studies relating to population surveys, fungal ecology and mycotoxicology. Trichothecene producers exhibit enormous strain-dependent chemical diversity, which may result in variation in the levels of the genotype-determining toxin and in the production of low to high amounts of atypical compounds. New high-throughput DNA-sequencing technologies promise to boost the diagnostics of mycotoxin genotypes. However, this requires a reference database containing a satisfactory taxonomic sampling of sequences showing high correlation to the chemotypes actually produced. We believe that one of the most pressing current challenges for such a database is linking molecular identification with the chemical diversity of the strains, as well as with other metadata. In this study, we use the Tri12 gene, which is involved in mycotoxin biosynthesis, to identify Tri genotypes through sequence comparison. Tri12 sequences from a range of geographically diverse fungal strains comprising 22 Fusarium species were stored in the ToxGen database, which provides descriptive and up-to-date annotations such as the Tri genotype and chemotype of the strains, chemical diversity, information on the trichothecene-inducing host, substrate or media, geographical locality, and the most recent taxonomic affiliations. The present initiative bridges the gap between the demands of comprehensive studies on trichothecene producers and the existing nucleotide sequence databases, which lack toxicological and other auxiliary data.
We invite researchers working in the fields of fungal taxonomy, epidemiology and mycotoxicology to join the freely available annotation effort.
Affiliation: Kulik, Tomasz. Uniwersytet Warminsko-mazurski W Olsztynie
Affiliation: Abarenkov, Kessy. University Of Tartu; Estonia
Affiliation: Busko, Maciej. Poznań University of Life Sciences; Poland
Affiliation: Bilska, Katarzyna. University of Warmia and Mazury; Poland
Affiliation: van Diepeningen, Anne D. University of Amsterdam; Netherlands
Affiliation: Ostrowska-Kolodziejczak, Anna. Poznań University of Life Sciences; Poland
Affiliation: Krawczyk, Katarzyna. University of Warmia and Mazury; Poland
Affiliation: Brankovics, Balázs. CBS-KNAW Fungal Biodiversity Centre; Netherlands. University of Amsterdam; Netherlands
Affiliation: Stenglein, Sebastian Alberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Mar del Plata. Instituto de Investigaciones en Biodiversidad y Biotecnología. Laboratorio de Biología Funcional y Biotecnología; Argentina
Affiliation: Sawicki, Jakub. University of Warmia and Mazury; Poland
Affiliation: Perkowski, Juliusz. Poznań University of Life Sciences; Poland
Tidying up international nucleotide sequence databases
Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source was supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.
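To put the reported counts in proportion, the short calculation below expresses the flagged sequences as shares of the 35,632 sequences mentioned in the abstract. The counts come from the abstract; the variable names are illustrative. The abstract does not say whether the two categories overlap, so the shares are reported separately rather than summed.

```python
TOTAL_SEQUENCES = 35_632  # sequences reported in the abstract

# Counts of flagged sequences, as reported in the abstract.
flagged_counts = {
    "chimeric": 677,
    "low read quality": 2_174,
}


def share_percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


flagged_shares = {
    label: share_percent(count, TOTAL_SEQUENCES)
    for label, count in flagged_counts.items()
}

for label, pct in flagged_shares.items():
    print(f"{label}: {pct:.1f}% of {TOTAL_SEQUENCES} sequences")
```

Under these figures, roughly 1.9% of the sequences were considered chimeric and roughly 6.1% were of low read quality.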
Incorporating molecular data in fungal systematics: a guide for aspiring researchers
The last twenty years have witnessed molecular data emerge as a primary
research instrument in most branches of mycology. Fungal systematics, taxonomy,
and ecology have all seen tremendous progress and have undergone rapid,
far-reaching changes as disciplines in the wake of continual improvement in DNA
sequencing technology. A taxonomic study that draws from molecular data
involves a long series of steps, ranging from taxon sampling through the
various laboratory procedures and data analysis to the publication process. All
steps are important and influence the results and the way they are perceived by
the scientific community. The present paper provides a reflective overview of
all major steps in such a project with the purpose to assist research students
about to begin their first study using DNA-based methods. We also take the
opportunity to discuss the role of taxonomy in biology and the life sciences in
general in the light of molecular data. While the best way to learn molecular
methods is to work side by side with someone experienced, we hope that the
present paper will serve to lower the learning threshold for the reader.
Comment: Submitted to Current Research in Environmental and Applied Mycology -
comments most welcome
Protax-fungi : a web-based tool for probabilistic taxonomic placement of fungal internal transcribed spacer sequences
Incompleteness of reference sequence databases and unresolved taxonomic relationships complicate taxonomic placement of fungal sequences. We developed Protax-fungi, a general tool for taxonomic placement of fungal internal transcribed spacer (ITS) sequences, and implemented it into the PlutoF platform of the UNITE database for molecular identification of fungi. With empirical data on root- and wood-associated fungi, Protax-fungi reliably identified (with at least 90% identification probability) the majority of sequences to the order level but only around one-fifth of them to the species level, reflecting the current limited coverage of the databases. Protax-fungi outperformed the Sintax and Rdb classifiers in terms of increased accuracy and decreased calibration error when applied to data on mock communities representing species groups with poor sequence database coverage. We applied Protax-fungi to examine the internal consistencies of the Index Fungorum and UNITE databases. This revealed inconsistencies in the taxonomy database as well as mislabelling and sequence quality problems in the reference database. The corresponding improvements were implemented in both databases. Protax-fungi provides a robust tool for performing statistically reliable identifications of fungi in spite of the incompleteness of extant reference sequence databases and unresolved taxonomic relationships.
Peer reviewed
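The "at least 90% identification probability" criterion can be read as a threshold applied rank by rank. The sketch below shows only that decision rule, assuming the per-rank probabilities have already been produced by some classifier. It does not reproduce the Protax-fungi model itself, and the rank list, function name and example probabilities are hypothetical.

```python
# Taxonomic ranks from coarse to fine (illustrative choice of ranks).
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def deepest_reliable_rank(rank_probabilities, threshold=0.9):
    """Return the finest rank whose placement probability, and that of
    every coarser rank, is at least `threshold`; None if even the
    coarsest rank falls below it.

    `rank_probabilities` maps rank name -> probability in [0, 1].
    """
    reliable = None
    for rank in RANKS:
        probability = rank_probabilities.get(rank)
        if probability is None or probability < threshold:
            break  # finer ranks cannot be reported reliably
        reliable = rank
    return reliable


# Hypothetical output for one sequence: confident down to order only.
example = {
    "phylum": 0.99,
    "class": 0.97,
    "order": 0.93,
    "family": 0.71,
    "genus": 0.40,
    "species": 0.20,
}
placement = deepest_reliable_rank(example)
```

With these made-up probabilities the sequence is reported at the order level, the situation the abstract describes for the majority of sequences.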
The UNITE database for molecular identification of fungi : handling dark taxa and parallel taxonomic classifications
Alfred P. Sloan Foundation [G-2015-14062]; Swedish Research Council of Environment, Agricultural Sciences, and Spatial Planning [FORMAS, 215-2011-498]; European Regional Development Fund (Centre of Excellence EcolChange) [TK131]; Estonian Research Council [IUT20-30]. Funding for open access charge: Swedish Research Council of Environment, Agricultural Sciences and Spatial Planning.
Peer reviewed
The curse of the uncultured fungus
The international DNA sequence databases abound in fungal sequences not annotated beyond the kingdom level, typically bearing names such as "uncultured fungus". These sequences beget low-resolution mycological results and invite further deposition of similarly poorly annotated entries. What do these sequences represent? This study uses a 767,918-sequence corpus of public full-length fungal sequences to estimate what proportion of them represent truly unidentifiable fungal taxa, and what proportion could have been identified to a more meaningful level at the time of sequence deposition. Our results suggest that more than 70% of these sequences would have been trivial to identify to at least the order/family level at the time of sequence deposition, hinting that factors other than poor availability of relevant reference sequences explain the low-resolution names. We speculate that researchers' perceived lack of time and lack of insight into the ramifications of this problem are the main explanations for the low-resolution names. We were surprised to find that more than a fifth of these sequences seem to have been deposited by mycologists rather than researchers unfamiliar with the consequences of poorly annotated fungal sequences in molecular repositories. The proportion of these needlessly poorly annotated sequences does not decline over time, suggesting that this problem must not be left unchecked.
PlutoF—a Web Based Workbench for Ecological and Taxonomic Research, with an Online Implementation for Fungal ITS Sequences
DNA sequences accumulating in the International Nucleotide Sequence Databases (INSD) form a rich source of information for taxonomic and ecological meta-analyses. However, these databases include many erroneous entries, and the data are poorly annotated with metadata, making it difficult to target and extract entries of interest with any degree of precision. Here we describe the web-based workbench PlutoF, which is designed to bridge the gap between the needs of contemporary research in biology and the existing software resources and databases. Built on a relational database, PlutoF allows remote-access rapid submission, retrieval, and analysis of study, specimen, and sequence data in INSD as well as for private datasets through web-based thin clients. In contrast to INSD, PlutoF supports internationally standardized terminology to allow very specific annotation and linking of interacting specimens and species. The sequence analysis module is optimized for identification and analysis of environmental ITS sequences of fungi, but it can be modified to operate on any genetic marker and group of organisms. The workbench is available at http://plutof.ut.ee.
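The idea of keeping study, specimen and sequence records in a relational database, so that entries of interest can be targeted through their metadata, can be sketched with Python's built-in `sqlite3` module. The table and column names below are hypothetical and chosen only for illustration; they are not the PlutoF schema, and the inserted rows are made-up demo data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with three linked tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE study (
            id INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE specimen (
            id INTEGER PRIMARY KEY,
            study_id INTEGER NOT NULL REFERENCES study(id),
            taxon TEXT,
            country TEXT
        );
        CREATE TABLE sequence (
            id INTEGER PRIMARY KEY,
            specimen_id INTEGER NOT NULL REFERENCES specimen(id),
            marker TEXT NOT NULL,
            residues TEXT NOT NULL
        );
        """
    )
    # Made-up demo rows.
    conn.execute("INSERT INTO study VALUES (1, 'Demo soil survey')")
    conn.execute(
        "INSERT INTO specimen VALUES (1, 1, 'Example taxon', 'Estonia')"
    )
    conn.execute("INSERT INTO sequence VALUES (1, 1, 'ITS', 'ACGTACGT')")
    conn.commit()
    return conn


def sequences_by_country(conn, country, marker="ITS"):
    """Return (study title, taxon, residues) for sequences of one marker
    whose specimen was collected in the given country."""
    return conn.execute(
        """
        SELECT study.title, specimen.taxon, sequence.residues
        FROM sequence
        JOIN specimen ON specimen.id = sequence.specimen_id
        JOIN study ON study.id = specimen.study_id
        WHERE specimen.country = ? AND sequence.marker = ?
        """,
        (country, marker),
    ).fetchall()


demo_conn = build_demo_db()
estonian_its = sequences_by_country(demo_conn, "Estonia")
```

Because the metadata live in structured columns rather than free text, a query such as "all ITS sequences from specimens collected in one country" becomes a simple join.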