
    Windows .NET Network Distributed Basic Local Alignment Search Toolkit (W.ND-BLAST)

    BACKGROUND: BLAST is one of the most common and useful tools for genetic research. This paper describes a software application we have termed Windows .NET Distributed Basic Local Alignment Search Toolkit (W.ND-BLAST), which enhances the BLAST utility by improving usability, fault recovery, and scalability in a Windows desktop environment. Our goal was to develop an easy-to-use, fault-tolerant, high-throughput BLAST solution that incorporates a comprehensive BLAST result viewer with curation and annotation functionality. RESULTS: W.ND-BLAST is a comprehensive Windows-based software toolkit that targets researchers, including those with minimal computer skills, and provides the ability to increase the performance of BLAST by distributing BLAST queries to any number of Windows-based machines across local area networks (LAN). W.ND-BLAST provides intuitive graphical user interfaces (GUIs) for BLAST database creation, BLAST execution, BLAST output evaluation and BLAST result exportation. This software also provides several layers of fault tolerance and fault recovery to prevent loss of data if nodes or master machines fail. This paper lays out the functionality of W.ND-BLAST. W.ND-BLAST displays close to 100% performance efficiency when distributing tasks to 12 remote computers of the same performance class. A high-throughput BLAST job which took 662.68 minutes (11 hours) on one average machine was completed in 44.97 minutes when distributed to 17 nodes, which included lower performance class machines. Finally, there are comprehensive high-throughput BLAST Output Viewer (BOV) and Annotation Engine components, which provide comprehensive exportation of BLAST hits to text files, annotated FASTA files, tables, or association files. CONCLUSION: W.ND-BLAST provides an interactive tool that allows scientists to easily utilize their available computing resources for high-throughput and comprehensive sequence analyses.
The install package for W.ND-BLAST is freely downloadable from . With registration the software is free; installation, networking, and usage instructions are provided, as well as a support forum.
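
The timing figures quoted in the abstract can be turned into a speedup and a parallel efficiency with a few lines of arithmetic. The sketch below is illustrative only: it uses the textbook definitions (speedup = serial time / parallel time, efficiency = speedup / node count), which may not be the exact metric the authors used, and the function names are hypothetical. Because the 17 nodes included lower performance class machines, dividing by the raw node count likely understates efficiency relative to same-class hardware.

```python
# Figures quoted in the W.ND-BLAST abstract.
SINGLE_MACHINE_MINUTES = 662.68   # one average machine
DISTRIBUTED_MINUTES = 44.97       # same job spread over 17 nodes
NODE_COUNT = 17


def speedup(serial_minutes: float, parallel_minutes: float) -> float:
    """Textbook speedup: how many times faster the distributed run was."""
    return serial_minutes / parallel_minutes


def parallel_efficiency(serial_minutes: float, parallel_minutes: float, nodes: int) -> float:
    """Textbook efficiency: speedup divided by the number of nodes.

    Assumption: all nodes are weighted equally, although the abstract says
    some of them were slower machines.
    """
    return speedup(serial_minutes, parallel_minutes) / nodes


if __name__ == "__main__":
    s = speedup(SINGLE_MACHINE_MINUTES, DISTRIBUTED_MINUTES)
    e = parallel_efficiency(SINGLE_MACHINE_MINUTES, DISTRIBUTED_MINUTES, NODE_COUNT)
    print(f"speedup: {s:.2f}x, efficiency: {e:.1%}")
```

Under these definitions the quoted run works out to roughly a 14.7-fold speedup, or about 87% efficiency across the 17 mixed-performance nodes.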

    Circlator: automated circularization of genome assemblies using long sequencing reads

    The assembly of DNA sequence data is undergoing a renaissance thanks to emerging technologies capable of producing reads tens of kilobases long. Assembling complete bacterial and small eukaryotic genomes is now possible, but the final step of circularizing sequences remains unsolved. Here we present Circlator, the first tool to automate assembly circularization and produce accurate linear representations of circular sequences. Using Pacific Biosciences and Oxford Nanopore data, Circlator correctly circularized 26 of 27 circularizable sequences, comprising 11 chromosomes and 12 plasmids from bacteria, the apicoplast and mitochondrion of Plasmodium falciparum and a human mitochondrion. Circlator is available at http://sanger-pathogens.github.io/circlator/

    Intestinal spirochaetes of the genus Brachyspira share a partially conserved 26 kilobase genomic region with Enterococcus faecalis and Escherichia coli

    Anaerobic intestinal spirochaetes of the genus Brachyspira include both pathogenic and commensal species. The two best-studied members are the pathogenic species B. hyodysenteriae (the aetiological agent of swine dysentery) and B. pilosicoli (a cause of intestinal spirochaetosis in humans and other species). Analysis of near-complete genome sequences of these two species identified a highly conserved 26 kilobase (kb) region that was shared, against a background of otherwise very little sequence conservation between the two species. PCR amplification was used to identify sets of contiguous genes from this region in the related Brachyspira species B. intermedia, B. innocens, B. murdochii, B. alvinipulli, and B. aalborgi, and demonstrated the presence of at least part of this region in species from throughout the genus. Comparative genomic analysis with other sequenced bacterial species revealed that none of the completely sequenced spirochaete species from different genera contained this conserved cluster of coding sequences. In contrast, Enterococcus faecalis and Escherichia coli contained high gene cluster conservation across the 26 kb region, against an expected background of little sequence conservation between these phylogenetically distinct species. The conserved region in B. hyodysenteriae contained five genes predicted to be associated with amino acid transport and metabolism, four with energy production and conversion, two with nucleotide transport and metabolism, one with ion transport and metabolism, and four with poorly characterised or uncertain function, including an ankyrin repeat unit at the 5’ end. The most likely explanation for the presence of this 26 kb region in the Brachyspira species and in two unrelated enteric bacterial species is that the region has been involved in horizontal gene transfer.

    A de novo reference transcriptome for Bolitoglossa vallecula, an Andean mountain salamander in Colombia

    © The Author(s), 2020. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Arenas Gomez, C. M., Woodcock, M. R., Smith, J. J., Voss, S. R., & Delgado, J. P. A de novo reference transcriptome for Bolitoglossa vallecula, an Andean mountain salamander in Colombia. Data in Brief, 29, (2020): 105256, doi:10.1016/j.dib.2020.105256. The amphibian order Caudata contains several important model species for biological research. However, there is a need to generate transcriptome data from representative species of the primary salamander families. Here we describe a de novo reference transcriptome for a terrestrial salamander, Bolitoglossa vallecula (Caudata: Plethodontidae). We employed paired-end (PE) Illumina RNA sequencing to assemble a de novo reference transcriptome for B. vallecula. Assembled transcripts were compared against sequences from other vertebrate taxa to identify orthologous genes, and compared to the transcriptome of a close plethodontid relative (Bolitoglossa ramosi) to identify commonly expressed genes in the skin. This dataset should be useful to future comparative studies aimed at understanding important biological processes, such as immunity, wound healing, and the production of antimicrobial compounds. This work was funded by a research grant from COLCIENCIAS 569 (GRANT 027-2103) and CODI (Programa Sostenibilidad) 2013–2014 of the University of Antioquia. A PhD fellowship to the first author, Claudia Arenas, was funded by the COLCIENCIAS 567 Grant. We thank the lab of Juan Fernando Alzate from the University of Antioquia for their help in developing our bioinformatic methodological approach. We thank Andrea Gómez and Melisa Hincapie for their help in animal collection and husbandry.

    PPNID: a reference database and molecular identification pipeline for plant-parasitic nematodes

    Motivation: The phylum Nematoda comprises the most cosmopolitan and abundant metazoans on Earth, and plant-parasitic nematodes represent one of the most significant nematode groups, causing severe losses in agriculture. In practice, the demand for accurate nematode identification is high in ecological, agricultural, taxonomic and phylogenetic research. Despite the importance of these nematodes, morphological diagnosis is often difficult due to phenotypic plasticity and the absence of clear diagnostic characters, while molecular identification is very difficult due to problematic databases and complex genetic backgrounds. Results: The present study attempts to address the shortcomings of currently available databases by creating a manually curated database including all up-to-date authentic barcoding sequences. To ease the laborious process of interpreting and identifying a given query sequence, we developed an automatic software pipeline for rapid species identification. The incorporated alignment function facilitates the examination of mutation distribution and therefore also reveals nucleotide autapomorphies, which are important in species delimitation. The implementation of genetic distance, plot and maximum likelihood phylogeny analysis provides more powerful optimality criteria than similarity searching and facilitates species delimitation using evolutionary or phylogenetic species concepts. The pipeline streamlines several functions to facilitate more precise data analyses, and the subsequent interpretation is easy and straightforward.

    Conserved noncoding sequences highlight shared components of regulatory networks in dicotyledonous plants

    Conserved noncoding sequences (CNSs) in DNA are reliable pointers to regulatory elements controlling gene expression. Using a comparative genomics approach with four dicotyledonous plant species (Arabidopsis thaliana, papaya [Carica papaya], poplar [Populus trichocarpa], and grape [Vitis vinifera]), we detected hundreds of CNSs upstream of Arabidopsis genes. Distinct positioning, length, and enrichment for transcription factor binding sites suggest these CNSs play a functional role in transcriptional regulation. The enrichment of transcription factors within the set of genes associated with CNSs is consistent with the hypothesis that together they form part of a conserved transcriptional network whose function is to regulate other transcription factors and control development. We identified a set of promoters where regulatory mechanisms are likely to be shared between the model organism Arabidopsis and other dicots, providing areas of focus for further research.