103 research outputs found
PHI-base update: additions to the pathogen–host interaction database
The pathogen–host interaction database (PHI-base) is a web-accessible database that catalogues experimentally verified pathogenicity, virulence and effector genes from bacterial, fungal and oomycete pathogens, which infect human, animal, plant, insect, fish and fungal hosts. Plant endophytes are also included. PHI-base is therefore an invaluable resource for the discovery of genes in medically and agronomically important pathogens, which may be potential targets for chemical intervention. The database is freely accessible to both academic and non-academic users. This publication describes recent additions to the database and both current and future applications. The number of fields that characterize PHI-base entries has almost doubled. Important additional fields deal with new experimental methods, strain information, pathogenicity islands and external references that link the database to external resources, for example, gene ontology terms and Locus IDs. Another important addition is the inclusion of anti-infectives and their target genes, which makes it possible to predict the compounds that may interact with newly identified virulence factors. In parallel, the curation process has been improved and now involves several external experts. On the technical side, several new search tools have been provided and the database is now also distributed in XML format. PHI-base is available at: http://www.phi-base.org/
The Ophiostoma fungal genome project
The Canadian Ophiostoma Genome Project, which was initiated in 2001, is a collaborative effort between research teams in four different universities. Its general objective is to conduct a large-scale identification and analysis of genes controlling important aspects of the life cycle of Ophiostomatoid fungi. To this end, several expressed sequence tag (EST) libraries were obtained for the Dutch elm disease pathogen Ophiostoma novo-ulmi and the sapstainer O. piceae, following partial, single-pass automated sequencing of complementary DNA clones. The largest EST library, prepared from yeast-like cells of O. novo-ulmi grown at 24 °C, contains over 3,400 readable sequences and serves as a general reference library for Ophiostomatoid fungi. Smaller, specific EST libraries were constructed from mycelia of O. novo-ulmi grown at suboptimal temperatures, from perithecia formed in laboratory crosses, and from O. piceae grown on different carbon sources. Ongoing bioinformatic searches in public databases have so far identified over 750 unique Ophiostoma ESTs that show significant homology with other fungal genes of known function, although a high proportion of Ophiostoma ESTs are either orphans (no match to any known gene) or show homology to genes of unknown function. In addition to EST analysis, differential expression of selected genes and structural genomics are also being studied.
TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets
Background: Sequencing metagenomes that were pre-amplified with primer-based methods requires the removal of the additional tag sequences from the datasets. The sequenced reads can contain deletions or insertions due to sequencing limitations, and the primer sequence may contain ambiguous bases. Furthermore, the tag sequence may be unavailable or incorrectly reported. Because of the potential for downstream inaccuracies introduced by unwanted sequence contaminations, it is important to use reliable tools for pre-processing sequence data. Results: TagCleaner is a web application developed to automatically identify and remove known or unknown tag sequences, allowing insertions and deletions in the dataset. TagCleaner is designed to filter the trimmed reads for duplicates, short reads, and reads with high rates of ambiguous sequences. An additional screening for and splitting of fragment-to-fragment concatenations that gave rise to artificial concatenated sequences can increase the quality of the dataset. Users may modify the different filter parameters according to their own preferences. Conclusions: TagCleaner is a publicly available web application that is able to automatically detect and efficiently remove tag sequences from metagenomic datasets. It is easily configurable and provides a user-friendly interface. The interactive web interface facilitates export functionality for subsequent data processing, and is available at http://edwards.sdsu.edu/tagcleaner.
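The filtering steps named in this abstract (tag removal, then filtering for short reads, ambiguous bases and duplicates) can be illustrated with a minimal Python sketch. This is not TagCleaner's algorithm: the function names, the default thresholds and the mismatch-only tag matching are assumptions made for illustration, and unlike TagCleaner the sketch does not handle insertions, deletions or unknown tags.

```python
from typing import Iterable, List


def trim_5prime_tag(read: str, tag: str, max_mismatches: int = 1) -> str:
    """Strip `tag` from the start of `read` if it matches with few mismatches.

    Simplification (assumption): only substitutions are tolerated, not indels.
    """
    if len(read) < len(tag):
        return read
    mismatches = sum(1 for a, b in zip(read, tag) if a != b)
    return read[len(tag):] if mismatches <= max_mismatches else read


def filter_reads(
    reads: Iterable[str],
    tag: str,
    min_length: int = 5,          # hypothetical default
    max_n_fraction: float = 0.1,  # hypothetical default
) -> List[str]:
    """Trim the tag, then drop short, ambiguous and duplicate reads."""
    seen = set()
    kept = []
    for read in reads:
        trimmed = trim_5prime_tag(read.upper(), tag.upper())
        # Drop empty or too-short reads (also avoids division by zero below).
        if not trimmed or len(trimmed) < min_length:
            continue
        # Drop reads with a high rate of ambiguous bases (N).
        if trimmed.count("N") / len(trimmed) > max_n_fraction:
            continue
        # Drop exact duplicates of reads already kept.
        if trimmed in seen:
            continue
        seen.add(trimmed)
        kept.append(trimmed)
    return kept


example_reads = [
    "ACGTTTGACCA",    # exact tag, kept after trimming
    "ACCTTTGACCA",    # tag with one mismatch, duplicate after trimming
    "ACGTNNNNNNA",    # mostly ambiguous after trimming
    "ACGTAC",         # too short after trimming
    "GGGGTTGACCATT",  # no tag, kept unchanged
]
cleaned = filter_reads(example_reads, tag="ACGT")
```

Filtering after trimming, rather than before, means that reads differing only in their tag collapse into one duplicate group.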
Genome-wide SNP identification by high-throughput sequencing and selective mapping allows sequence assembly positioning using a framework genetic linkage map
Background: Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method. Results: The strategy was tested on a draft genome of the fungal pathogen Venturia inaequalis, the causal agent of apple scab, and further validated using sequence contigs derived from the diploid plant genome Fragaria vesca. Using our novel method we were able to anchor 70% and 92% of sequence assemblies for V. inaequalis and F. vesca, respectively, to genetic linkage maps. Conclusions: We demonstrated the utility of this approach by accurately determining the bin map positions of the majority of the large sequence contigs from each genome sequence and validated our method by mapping single sequence repeat markers derived from sequence contigs on a full mapping population.
diArk 2.0 provides detailed analyses of the ever increasing eukaryotic genome sequencing data
Background: With current next-generation sequencing methods, sequencing even the largest mammalian genomes has become a matter of days. It comes as no surprise that dozens of genome assemblies are now released per month. Since the number of next-generation sequencing machines is increasing worldwide and new major sequencing plans are being announced, a further increase in the speed of releasing genome assemblies is expected. Thus it becomes increasingly important to get an overview of, as well as detailed information about, available sequenced genomes. The different sequencing and assembly methods have specific characteristics that need to be known to evaluate the various genome assemblies before performing subsequent analyses. Results: diArk has been developed to provide fast and easy access to all sequenced eukaryotic genomes worldwide. Currently, diArk 2.0 contains information about more than 880 species and more than 2350 genome assembly files. Many metadata such as sequencing and read-assembly methods, sequencing coverage, GC-content, extended lists of alternatively used scientific names and common species names, and various kinds of statistics are provided. To make the data intuitive to approach, the web interface makes extensive use of modern web techniques. A number of search modules and result views facilitate finding and judging the data of interest. Subscribing to the RSS feed is the easiest way to stay up-to-date with the latest genome data. Conclusions: diArk 2.0 is the most up-to-date database of sequenced eukaryotic genomes compared to databases like GOLD, NCBI Genome, NHGRI, and ISC. It is different in that only those projects are stored for which genome assembly data or considerable amounts of cDNA data are available. Projects in the planning stage or in the process of being sequenced are not included. The user can easily search through the provided data and directly access the genome assembly files of the sequenced genome of interest. diArk 2.0 is available at http://www.diark.org.
Evaluation of Methods for De Novo Genome Assembly from High-Throughput Sequencing Reads Reveals Dependencies That Affect the Quality of the Results
Recent developments in high-throughput sequencing technology have made low-cost sequencing an attractive approach for many genome analysis tasks. Increasing read lengths, improving quality and the production of ever larger numbers of usable sequences per instrument run continue to make whole-genome assembly an appealing target application. In this paper we evaluate the feasibility of de novo genome assembly from short reads (≤100 nucleotides) through a detailed study involving genomic sequences of various lengths and origins, in conjunction with several of the currently popular assembly programs. Our extensive analysis demonstrates that, in addition to sequencing coverage, attributes such as the architecture of the target genome, the assembly program used, the average read length and the observed sequencing error rates are powerful variables that affect the best achievable assembly of the target sequence in terms of size and correctness.
Sequence-specific error profile of Illumina sequencers
We identified the sequence-specific starting positions of consecutive miscalls in the mapping of reads obtained from the Illumina Genome Analyser (GA). Detailed analysis of the miscall pattern indicated that the underlying mechanism involves sequence-specific interference of the base elongation process during sequencing. The two major sequence patterns that trigger this sequence-specific error (SSE) are: (i) inverted repeats and (ii) GGC sequences. We speculate that these sequences favor dephasing by inhibiting single-base elongation, by: (i) folding single-stranded DNA and (ii) altering enzyme preference. This phenomenon is a major cause of sequence coverage variability and of the unfavorable bias observed for population-targeted methods such as RNA-seq and ChIP-seq. Moreover, SSE is a potential cause of false single-nucleotide polymorphism (SNP) calls and also significantly hinders de novo assembly. This article highlights the importance of recognizing SSE and its underlying mechanisms in the hope of enhancing the potential usefulness of the Illumina sequencers.
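The two sequence patterns named in this abstract, GGC motifs and inverted repeats, can be located in a sequence with a short scan. The Python sketch below is only an illustration of what those patterns look like, not the authors' analysis pipeline. The function names, the arm length and the maximum spacer are assumptions, and an inverted repeat is defined here simply as a segment followed within a few bases by its exact reverse complement.

```python
from typing import List, Tuple

# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str = "GGC") -> List[int]:
    """Return 0-based start positions of every (overlapping) motif occurrence."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


def find_inverted_repeats(
    seq: str, arm: int = 4, max_gap: int = 6  # hypothetical defaults
) -> List[Tuple[int, int, int]]:
    """Return (left_start, right_start, arm_length) for exact inverted repeats.

    The right arm must equal the reverse complement of the left arm and
    begin at most `max_gap` bases after the left arm ends.
    """
    hits = []
    n = len(seq)
    for i in range(n - 2 * arm + 1):
        target = reverse_complement(seq[i:i + arm])
        for gap in range(max_gap + 1):
            j = i + arm + gap
            if j + arm > n:
                break
            if seq[j:j + arm] == target:
                hits.append((i, j, arm))
    return hits


example = "TTGGCAAAATGCCTT"
ggc_positions = find_motif(example)
inverted = find_inverted_repeats(example)
```

In the example sequence, the left arm `GGCA` at position 2 pairs with `TGCC` at position 9, so the single strand could fold back on itself, which is the kind of structure the abstract speculates may interfere with elongation.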
Genomics, Lifestyles and Future Prospects of Wood-Decay and Litter-Decomposing Basidiomycota
Saprobic (saprotrophic and saprophytic) wood-decay fungi are mostly species belonging to the fungal phylum Basidiomycota, whereas saprobic plant litter-decomposing fungi are species of both the Basidiomycota and the second Dikarya phylum, Ascomycota. Wood-colonizing white rot and brown rot fungi are principally polypore, gilled pleurotoid, or corticioid Basidiomycota species of the class Agaricomycetes, which also includes forest and grassland soil-inhabiting and litter-decomposing mushroom species. In this chapter, examples of lignocellulose degradation patterns are presented in light of current genome sequencing and comparative genomics of fungal wood-decay enzymes. Specific attention is given to the model white rot fungus, the lignin-degrading species Phanerochaete chrysosporium, and its wood decay-related gene expression (transcriptomics) on lignocellulose substrates. Types of fungal decay patterns on wood and plant lignocellulose are discussed in view of fungal lifestyle strategies. The potential of the plant biomass-decomposing Basidiomycota species, their secreted enzymes and respective lignocellulose-attacking genes is evaluated with regard to the development of biotechnological and industrial applications.
RNA-seq analyses of blood-induced changes in gene expression in the mosquito vector species, Aedes aegypti
Background: Hematophagy is a common trait of insect vectors of disease. Extensive genome-wide transcriptional changes occur in mosquitoes after blood meals, and these are related to digestive and reproductive processes, among others. Studies of these changes are expected to reveal molecular targets for novel vector control and pathogen transmission-blocking strategies. The mosquito Aedes aegypti (Diptera, Culicidae), a vector of Dengue viruses, Yellow Fever Virus (YFV) and Chikungunya virus (CV), is the subject of this study to look at genome-wide changes in gene expression following a blood meal. Results: Transcriptional changes that follow a blood meal in Ae. aegypti females were explored using RNA-seq technology. Over 30% of more than 18,000 investigated transcripts accumulate differentially in mosquitoes at five hours after a blood meal when compared to those fed only on sugar. Forty transcripts accumulate only in blood-fed mosquitoes. The list of regulated transcripts correlates with an enhancement of digestive activity and a suppression of environmental stimuli perception and innate immunity. The alignment of more than 65 million high-quality short reads to the Ae. aegypti reference genome permitted the refinement of the current annotation of transcript boundaries, as well as the discovery of novel transcripts, exons and splicing variants. Cis-regulatory elements (CRE) and cis-regulatory modules (CRM) enriched significantly at the 5' end flanking sequences of blood meal-regulated genes were identified. Conclusions: This study provides the first global view of the changes in transcript accumulation elicited by a blood meal in Ae. aegypti females. This information permitted the identification of classes of potentially co-regulated genes and a description of biochemical and physiological events that occur immediately after blood feeding. The data presented here serve as a basis for novel vector control and pathogen transmission-blocking strategies including those in which the vectors are modified genetically to express anti-pathogen effector molecules.
Design, Validation and Annotation of Transcriptome-Wide Oligonucleotide Probes for the Oligochaete Annelid Eisenia fetida
High-density oligonucleotide probe arrays have increasingly become an important tool in genomics studies. In organisms with incomplete genome sequence, one strategy for oligo probe design is to reduce the number of unique probes that target every non-redundant transcript through bioinformatic analysis and experimental testing. Here we adopted this strategy in making oligo probes for the earthworm Eisenia fetida, a species for which we have sequenced transcriptome-scale expressed sequence tags (ESTs). Our objectives were to identify unique transcripts as targets, to select an optimal and non-redundant oligo probe for each of these target ESTs, and to annotate the selected target sequences. We developed a streamlined and easy-to-follow approach to the design, validation and annotation of species-specific array probes. Four 244K-formatted oligo arrays were designed using eArray and were hybridized to a pooled E. fetida cRNA sample. We identified 63,541 probes with unsaturated signal intensities consistently above the background level. Target transcripts of these probes were annotated using several sequence alignment algorithms. Significant hits were obtained for 37,439 (59%) probed targets. We validated and made publicly available 63.5K oligo probes so the earthworm research community can use them to pursue ecological, toxicological, and other functional genomics questions. Our approach is efficient, cost-effective and robust because it (1) does not require a major genomics core facility; (2) allows new probes to be easily added and old probes modified or eliminated when new sequence information becomes available; (3) is not bioinformatics-intensive upfront but does provide opportunities for more in-depth annotation of biological functions for target genes; and (4) allows EST orthologs to the UniGene clusters of a reference genome to be identified and selected, if desired, in order to improve the target gene specificity of designed probes. This approach is particularly applicable to organisms with a wealth of EST sequences but an unfinished genome.
- …