250 research outputs found

    MetaBinG: Using GPUs to Accelerate Metagenomic Sequence Classification

    Metagenomic sequence classification is a procedure to assign sequences to their source genomes. It is one of the important steps for metagenomic sequence data analysis. Although many methods exist, classification of high-throughput metagenomic sequence data in a limited time is still a challenge. We present here an ultra-fast metagenomic sequence classification system (MetaBinG) using graphics processing units (GPUs). The accuracy of MetaBinG is comparable to the best existing systems and it can classify a million 454 reads within five minutes, which is more than 2 orders of magnitude faster than existing systems. MetaBinG is publicly available at http://cbb.sjtu.edu.cn/~ccwei/pub/software/MetaBinG/MetaBinG.php

    A walk into the luxR regulators of actinobacteria : phylogenomic distribution and functional diversity

    LuxR regulators are a widely studied group of bacterial helix-turn-helix (HTH) transcription factors involved in the regulation of many genes coding for important traits at an ecological and medical level. This regulatory family is particularly known for its involvement in quorum-sensing (QS) mechanisms, i.e., in the bacterial ability to communicate through the synthesis and binding of molecular signals. However, these studies have been mainly focused on Gram-negative organisms, and the presence of LuxR regulators in the Gram-positive Actinobacteria phylum is still poorly explored. In this manuscript, the presence of LuxR regulators among Actinobacteria was assayed using a domain-based strategy. A total of 991 proteins having one LuxR domain were identified in 53 genome-sequenced actinobacterial species, of which 59% had an additional domain. In most cases (53%) this domain was REC (receiver domain), suggesting that LuxR regulators in Actinobacteria may either function as single transcription factors or as part of two-component systems. The frequency, distribution and evolutionary stability of each of these sub-families of regulators were analyzed and contextualized regarding the ecological niche occupied by each organism. The results show that the presence of extra domains in the LuxR regulators was likely driven by a general need to physically uncouple the signal sensing from the signal transduction. Moreover, the total frequency of LuxR regulators was shown to be dependent on genetic, metabolic and ecological variables. Finally, the functional annotation of the LuxR regulators revealed that the bacterial ecological niche has biased the specialization of these proteins. In the case of pathogens, our results suggest that LuxR regulators can be involved in virulence and are therefore promising targets for future studies in the health-related biotechnology field. Fundação para a Ciência e Tecnologia. European Regional Development Fund - COMPETE program and FCT - Fundação para a Ciência e Tecnologia, with the project FCOMP-01-0124-FEDER-022718 (ref FCT Pest-C/SAU/LA0002/2011). MVM was supported by “Programa Ciência 2007” sponsored by POPH QREN Tipologia 4.2 Promoção do Emprego Cientifico program and co-financed by the European Social Fund and Portuguese national funds from the MCTES. CLS was supported by the FCT fellowship SFRH/BPD/62978/2009

    The Agaricus bisporus cox1 Gene: The Longest Mitochondrial Gene and the Largest Reservoir of Mitochondrial Group I Introns

    In eukaryotes, introns are located in nuclear and organelle genes from several kingdoms. Large introns (up to 5 kbp) are frequent in mitochondrial genomes of plants and fungi but scarce in Metazoa, even if these organisms are grouped with fungi among the Opisthokonts. Mitochondrial introns are classified in two groups (I and II) according to their RNA secondary structure involved in the intron self-splicing mechanism. Most of these mitochondrial group I introns carry a “Homing Endonuclease Gene” (heg) encoding a DNA endonuclease acting in transfer and site-specific integration (“homing”) and allowing intron spreading and gain after lateral transfer even between species from different kingdoms. Opposed to this gain mechanism is another, which implies that introns, which would have been abundant in the ancestral genes, would mainly evolve by loss. The importance of both mechanisms (loss and gain) is a matter of debate. Here we report the sequence of the cox1 gene of the button mushroom Agaricus bisporus, the most widely cultivated mushroom in the world. This gene is both the longest mitochondrial gene (29,902 nt) and the largest group I intron reservoir reported to date, with 18 group I introns and 1 group II intron. An exhaustive analysis of the group I introns available in cox1 genes shows that they are mobile genetic elements whose numerous events of loss and gain by lateral transfer combine to explain their wide and patchy distribution extending over several kingdoms. An overview of intron distribution, together with the high frequency of eroded heg, suggests that they are evolving towards loss. In this landscape of eroded and lost intron sequences, the A. bisporus cox1 gene exhibits peculiar dynamics of intron keeping and catching, leading to the largest collection of mitochondrial group I introns reported to date in a Eukaryote

    Molecular Diversity of Fungal Phylotypes Co-Amplified Alongside Nematodes from Coastal and Deep-Sea Marine Environments

    Nematodes and fungi are both ubiquitous in marine environments, yet few studies have investigated relationships between these two groups. Microbial species share many well-documented interactions with both free-living and parasitic nematode species, and limited data from previous studies have suggested ecological associations between fungi and nematodes in benthic marine habitats. This study aimed to further document the taxonomy and distribution of fungal taxa often co-amplified from nematode specimens. A total of 15 fungal 18S rRNA phylotypes were isolated from nematode specimens representing both deep-sea and shallow water habitats; all fungal isolates displayed high pairwise sequence identities with published data in Genbank (99–100%) and unpublished high-throughput 454 environmental datasets (>95%). BLAST matches indicate marine fungal sequences amplified in this study broadly represent taxa within the phyla Ascomycota and Basidiomycota, and several phylotypes showed robust groupings with known taxa in phylogenetic topologies. In addition, some fungal phylotypes appeared to be present in disparate geographic habitats, suggesting cosmopolitan distributions or closely related species complexes in at least some marine fungi. The present study was only able to isolate fungal DNA from a restricted set of nematode taxa; further work is needed to fully investigate the taxonomic scope and function of nematode-fungal interactions

    Genome content and phylogenomics reveal both ancestral and lateral evolutionary pathways in plant-pathogenic Streptomyces species

    © 2016, American Society for Microbiology. All Rights Reserved. Streptomyces spp. are highly differentiated actinomycetes with large, linear chromosomes that encode an arsenal of biologically active molecules and catabolic enzymes. Members of this genus are well equipped for life in nutrient-limited environments and are common soil saprophytes. Out of the hundreds of species in the genus Streptomyces, a small group has evolved the ability to infect plants. The recent availability of Streptomyces genome sequences, including four genomes of pathogenic species, provided an opportunity to characterize the gene content specific to these pathogens and to study phylogenetic relationships among them. Genome sequencing, comparative genomics, and phylogenetic analysis enabled us to discriminate pathogenic from saprophytic Streptomyces strains; moreover, we calculated that the pathogen-specific genome contains 4,662 orthologs. Phylogenetic reconstruction suggested that Streptomyces scabies and S. ipomoeae share an ancestor but that their biosynthetic clusters encoding the required virulence factor thaxtomin have diverged. In contrast, S. turgidiscabies and S. acidiscabies, two relatively unrelated pathogens, possess highly similar thaxtomin biosynthesis clusters, which suggests that the acquisition of these genes was through lateral gene transfer

    An Alignment-Free Approach for Eukaryotic ITS2 Annotation and Phylogenetic Inference

    The ITS2 gene class shows a high sequence divergence among its members that has complicated its annotation and its use for reconstructing phylogenies at a higher taxonomical level (beyond species and genus). Several alignment strategies have been implemented to improve the ITS2 annotation quality and its use for phylogenetic inferences. Although alignment-based methods have been pushed to the limit of their complexity to tackle both issues, no alignment-free approaches have been able to successfully address both topics. By contrast, the use of simple alignment-free classifiers, like the topological indices (TIs) containing information about the sequence and structure of ITS2, may prove to be a useful approach for gene prediction and for assessing the phylogenetic relationships of the ITS2 class in eukaryotes. Thus, we used the TI2BioP (Topological Indices to BioPolymers) methodology [1], [2], freely available at http://ti2biop.sourceforge.net/, to calculate two different TIs. One class was derived from the ITS2 artificial 2D structures generated from DNA strings and the other from the secondary structure inferred from RNA folding algorithms. Two alignment-free models based on Artificial Neural Networks were developed for the ITS2 class prediction using the two classes of TIs referred to above. Both models showed similar performances on the training and the test sets, reaching values above 95% in the overall classification. Due to the importance of the ITS2 region for fungi identification, a novel ITS2 genomic sequence was isolated from Petrakia sp. This sequence and the test set were used to comparatively evaluate the conventional classification models based on multiple sequence alignments, like Hidden Markov based approaches, revealing the success of our models in identifying novel ITS2 members. The isolated sequence was assessed using traditional and alignment-free based techniques applied to phylogenetic inference to complement the taxonomy of the Petrakia sp. fungal isolate

    A format for phylogenetic placements

    We have developed a unified format for phylogenetic placements, that is, mappings of environmental sequence data (e.g. short reads) into a phylogenetic tree. We are motivated to do so by the growing number of tools for computing and post-processing phylogenetic placements, and the lack of an established standard for storing them. The format is lightweight, versatile, extensible, and is based on the JSON format which can be parsed by most modern programming languages. Our format is already implemented in several tools for computing and post-processing parsimony- and likelihood-based phylogenetic placements, and has worked well in practice. We believe that establishing a standard format for analyzing read placements at this early stage will lead to a more efficient development of powerful and portable post-analysis tools for the growing applications of phylogenetic placement. Comment: Documents version 3 of the forma
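
    Because the abstract above describes the format only as JSON-based, the following is a minimal sketch of how such a placement file might be read, not the authors' specification. The key names (`tree`, `fields`, `placements`, `p`, `n`), the column names, and all values are assumptions chosen for illustration; the helper `best_placements` is hypothetical.

```python
import json

# Hypothetical placement document. Key names and values are illustrative
# assumptions; the abstract only states that the format is based on JSON.
EXAMPLE = """
{
  "version": 3,
  "tree": "((A:0.1{0},B:0.2{1}):0.05{2},C:0.3{3}){4};",
  "fields": ["edge_num", "likelihood", "like_weight_ratio",
             "distal_length", "pendant_length"],
  "placements": [
    {"p": [[1, -1234.5, 0.8, 0.01, 0.02],
           [3, -1236.0, 0.2, 0.05, 0.03]],
     "n": ["read_1"]},
    {"p": [[0, -987.6, 1.0, 0.02, 0.01]],
     "n": ["read_2", "read_3"]}
  ],
  "metadata": {"invocation": "illustrative example"}
}
"""


def best_placements(doc):
    """Map each read name to its candidate row with the highest like_weight_ratio."""
    fields = doc["fields"]
    lwr = fields.index("like_weight_ratio")  # column position of the weight ratio
    best = {}
    for placement in doc["placements"]:
        # Each row in "p" is one candidate location on the reference tree.
        top = max(placement["p"], key=lambda row: row[lwr])
        record = dict(zip(fields, top))
        # Several reads may share the same set of candidate placements.
        for name in placement.get("n", []):
            best[name] = record
    return best


doc = json.loads(EXAMPLE)
result = best_placements(doc)
for name in sorted(result):
    print(name, result[name]["edge_num"], result[name]["like_weight_ratio"])
```

    Keeping the column names in a separate `fields` list, as assumed here, lets a reader look up values by name rather than by fixed position, which is one way a JSON layout can stay extensible.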

    Complete Genome Sequence of the Aerobic CO-Oxidizing Thermophile Thermomicrobium roseum

    In order to enrich the phylogenetic diversity represented in the available sequenced bacterial genomes and as part of an “Assembling the Tree of Life” project, we determined the genome sequence of Thermomicrobium roseum DSM 5159. T. roseum DSM 5159 is a red-pigmented, rod-shaped, Gram-negative extreme thermophile isolated from a hot spring that possesses both an atypical cell wall composition and an unusual cell membrane that is composed entirely of long-chain 1,2-diols. Its genome is composed of two circular DNA elements, one of 2,006,217 bp (referred to as the chromosome) and one of 919,596 bp (referred to as the megaplasmid). Strikingly, though few standard housekeeping genes are found on the megaplasmid, it does encode a complete system for chemotaxis including both chemosensory components and an entire flagellar apparatus. This is the first known example of a complete flagellar system being encoded on a plasmid and suggests a straightforward means for lateral transfer of flagellum-based motility. Phylogenomic analyses support the recent rRNA-based analyses that led to T. roseum being removed from the phylum Thermomicrobia and assigned to the phylum Chloroflexi. Because T. roseum is a deep-branching member of this phylum, analysis of its genome provides insights into the evolution of the Chloroflexi. In addition, even though this species is not photosynthetic, analysis of the genome provides some insight into the origins of photosynthesis in the Chloroflexi. Metabolic pathway reconstructions and experimental studies revealed new aspects of the biology of this species. For example, we present evidence that T. roseum oxidizes CO aerobically, making it the first thermophile known to do so. In addition, we propose that glycosylation of its carotenoids plays a crucial role in the adaptation of the cell membrane to this bacterium's thermophilic lifestyle. Analyses of published metagenomic sequences from two hot springs similar to the one from which this strain was isolated show that close relatives of T. roseum DSM 5159 are present but have some key differences from the strain sequenced

    Soybean aphid biotype 1 genome: Insights into the invasive biology and adaptive evolution of a major agricultural pest

    The soybean aphid, Aphis glycines Matsumura (Hemiptera: Aphididae), is a serious pest of the soybean plant, Glycine max, a major world-wide agricultural crop. We assembled a de novo genome sequence of Ap. glycines Biotype 1, from a culture established shortly after this species invaded North America. 20.4% of the Ap. glycines proteome is duplicated. These in-paralogs are enriched with Gene Ontology (GO) categories mostly related to apoptosis, a possible adaptation to plant chemistry and other environmental stressors. Approximately one-third of these genes show parallel duplication in other aphids. However, Ap. gossypii, its most closely related species, has the lowest number of these duplicated genes. An Illumina GoldenGate assay of 2380 SNPs was used to determine the world-wide population structure of Ap. glycines. Chinese and South Korean aphids are the closest to those in North America. China is the likely origin of other Asian aphid populations. The most distantly related aphids to those in North America are from Australia. The diversity of Ap. glycines in North America has decreased over time since its arrival. The genetic diversity of the Ap. glycines North American population, sampled from shortly after its first detection in 2001 up to 2012, does not appear to correlate with geography. However, aphids collected on soybean Rag experimental varieties in Minnesota (MN), Iowa (IA), and Wisconsin (WI), closer to high density Rhamnus cathartica stands, appear to have higher capacity to colonize resistant soybean plants than aphids sampled in Ohio (OH), North Dakota (ND), and South Dakota (SD). Samples from the former states have SNP alleles with high FST values and frequencies that overlap with genes involved in iron metabolism, a crucial metabolic pathway that may be affected by the Rag-associated soybean plant response. The Ap. glycines Biotype 1 genome will provide needed information for future analyses of mechanisms of aphid virulence and pesticide resistance as well as facilitate comparative analyses between aphids with differing natural history and host plant range

    The Complete Genome Sequence of Haloferax volcanii DS2, a Model Archaeon

    a key model organism, not only for the study of halophilicity, but also for archaeal biology in general. DS2, the type strain of this species. The genome contains a main 2.848 Mb chromosome, three smaller chromosomes pHV1, 3, 4 (85, 438, 636 kb, respectively) and the pHV2 plasmid (6.4 kb).