19 research outputs found

    BlaSTorage: a fast package to parse, manage and store BLAST results

    Background: Large-scale sequence studies requiring BLAST-based analysis produce huge amounts of data to be parsed. BLAST parsers are available, but they often lack important features, such as keeping all information from the raw BLAST output, allowing direct access to single results, and performing logical operations over them. Findings: We implemented BlaSTorage, a Python package that parses multi-BLAST results and returns them in a purpose-built object-database format. Unlike other BLAST parsers, BlaSTorage retains and stores all parts of BLAST results, including alignments, without loss of information; a complete API allows access to all the data components. Conclusions: BlaSTorage is comparable in speed to more basic parsers written in compiled languages such as C++ and can be easily integrated into web applications or software pipelines.

    PlantGDB: a resource for comparative plant genomics

    PlantGDB (http://www.plantgdb.org/) is a genomics database encompassing sequence data for green plants (Viridiplantae). PlantGDB provides annotated transcript assemblies for >100 plant species, with transcripts mapped to their cognate genomic context where available, integrated with a variety of sequence analysis tools and web services. For 14 plant species with emerging or complete genome sequence, PlantGDB's genome browsers (xGDB) serve as a graphical interface for viewing, evaluating and annotating transcript and protein alignments to chromosome or bacterial artificial chromosome (BAC)-based genome assemblies. Annotation is facilitated by the integrated yrGATE module for community curation of gene models. Novel web services at PlantGDB include Tracembler, an iterative alignment tool that generates contigs from GenBank trace file data and BioExtract Server, a web-based server for executing custom sequence analysis workflows. PlantGDB also hosts a plant genomics research outreach portal (PGROP) that facilitates access to a large number of resources for research and training

    The BioExtract Server: a web-based bioinformatic workflow platform

    The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet

    De novo Assembly and Analysis of the Northern Leopard Frog Rana pipiens Transcriptome

    The northern leopard frog Rana (Lithobates) pipiens is an important animal model, used extensively in cancer, neurology, physiology, and biomechanical studies. R. pipiens is a native North American frog whose range extends from northern Canada to the southwestern United States, but over the past few decades its populations have declined significantly, and it is now considered uncommon in large portions of the United States and Canada. To aid in the study and conservation of R. pipiens, this paper describes the first R. pipiens transcriptome. The R. pipiens transcriptome was annotated using Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Eukaryotic Orthologous Groups (KOG). Differential expression analysis revealed universal and tissue-specific genes, and endocrine-related genes were identified. Transcriptome assemblies and other sequence data are available for download

    Sal-Site: Integrating new and existing ambystomatid salamander research and informational resources

    Salamanders of the genus Ambystoma are a unique model organism system because they enable natural history and biomedical research in the laboratory or field. We developed Sal-Site to integrate new and existing ambystomatid salamander research resources in support of this model system. Sal-Site hosts six important resources: 1) Salamander Genome Project: an information-based website describing progress in genome resource development, 2) Ambystoma EST Database: a database of manually edited and analyzed contigs assembled from ESTs that were collected from A. tigrinum tigrinum and A. mexicanum, 3) Ambystoma Gene Collection: a database containing full-length protein-coding sequences, 4) Ambystoma Map and Marker Collection: an image and database resource that shows the location of mapped markers on linkage groups, provides information about markers, and provides integrating links to the Ambystoma EST Database and Ambystoma Gene Collection databases, 5) Ambystoma Genetic Stock Center: a website and collection of databases that describe an NSF-funded salamander rearing facility that generates and distributes biological materials to researchers and educators throughout the world, and 6) Ambystoma Research Coordination Network: a website detailing current research projects and activities involving an international group of researchers. Sal-Site is accessible at

    A Profile of Putative Parasitism Genes Expressed in the Esophageal Gland Cells of the Root-knot Nematode Meloidogyne incognita

    Identifying parasitism genes encoding proteins secreted from a nematode's esophageal gland cells and injected through its stylet into plant tissue is the key to understanding the molecular basis of nematode parasitism of plants. Meloidogyne incognita parasitism genes were cloned by microaspirating the cytoplasm from the esophageal gland cells of different parasitic stages to provide mRNA to create a gland cell-specific cDNA library by long-distance reverse-transcriptase polymerase chain reaction. Of 2,452 cDNA clones sequenced, deduced protein sequences of 185 cDNAs had a signal peptide for secretion and, thus, could have a role in root-knot nematode parasitism of plants. High-throughput in situ hybridization with cDNA clones encoding signal peptides resulted in probes of 37 unique clones specifically hybridizing to transcripts accumulating within the subventral (13 clones) or dorsal (24 clones) esophageal gland cells of M. incognita. In BLASTP analyses, 73% of the predicted proteins were novel proteins. Those with similarities to known proteins included a pectate lyase, acid phosphatase, and hypothetical proteins from other organisms. Our cell-specific analysis of genes encoding secretory proteins provided, for the first time, a profile of putative parasitism genes expressed in the M. incognita esophageal gland cells throughout the parasitic cycle

    The ASRG database: identification and survey of Arabidopsis thaliana genes involved in pre-mRNA splicing

    A total of 74 small nuclear RNA (snRNA) genes and 395 genes encoding splicing-related proteins were identified in the Arabidopsis genome by sequence comparison and motif searches, including the previously elusive U4atac snRNA gene. Most of the genes have not been studied experimentally. Classification of these genes and detailed information on gene structure, alternative splicing, gene duplications and phylogenetic relationships are made accessible as a comprehensive database of Arabidopsis Splicing Related Genes (ASRG) on our website