88 research outputs found

    HiSpOD: probe design for functional DNA microarrays.

    MOTIVATION: The use of DNA microarrays allows the monitoring of the extreme microbial diversity encountered in complex samples, such as environmental ones, as well as that of their functional capacities. However, no probe design software currently available is adapted to easily design efficient and explorative probes for functional gene arrays. RESULTS: We present a new efficient functional microarray probe design algorithm called HiSpOD (High Specific Oligo Design). It uses individual nucleic acid sequences or consensus sequences produced by multiple alignments to design highly specific probes. To bypass the crucial problem of cross-hybridization, probe specificity is assessed by similarity search against a large formatted database dedicated to microbial communities containing about 10 million coding sequences (CDS). For experimental validation, a microarray targeting genes encoding enzymes involved in chlorinated solvent biodegradation was built. The results obtained from a contaminated environmental sample proved the specificity and the sensitivity of probes designed with the HiSpOD program. AVAILABILITY: http://fc.isima.fr/~g2im/hispod/

    Gene Capture Coupled to High-Throughput Sequencing as a Strategy for Targeted Metagenome Exploration

    Next-generation sequencing (NGS) allows faster acquisition of metagenomic data, but complete exploration of complex ecosystems is hindered by the extraordinary diversity of microorganisms. To reduce the environmental complexity, we created an innovative solution hybrid selection (SHS) method that is combined with NGS to characterize large DNA fragments harbouring biomarkers of interest. The quality of enrichment was evaluated after fragments containing the methyl coenzyme M reductase subunit A gene (mcrA), the biomarker of methanogenesis, were captured from a Methanosarcina strain and a metagenomic sample from a meromictic lake. The methanogen diversity was compared with direct metagenome and mcrA-based amplicon pyrosequencing strategies. The SHS approach resulted in the capture of DNA fragments up to 2.5 kb with an enrichment efficiency between 41 and 100%, depending on the sample complexity. Compared with direct metagenome and amplicon sequencing, SHS detected broader mcrA diversity, and it allowed efficient sampling of the rare biosphere and unknown sequences. In contrast to amplicon-based strategies, SHS is less biased and GC independent, and it recovered complete biomarker sequences in addition to conserved regions. Because this method can also isolate the regions flanking the target sequences, it could facilitate operon reconstructions.

    Identification of transcriptional signals in Encephalitozoon cuniculi widespread among Microsporidia phylum: support for accurate structural genome annotation

    Background: Microsporidia are obligate intracellular eukaryotic parasites with genomes ranging in size from 2.3 Mbp to more than 20 Mbp. The extremely small (2.9 Mbp) and highly compact (~1 gene/kb) genome of the human parasite Encephalitozoon cuniculi has been fully sequenced. The aim of this study was to characterize noncoding motifs that could be involved in regulation of gene expression in E. cuniculi and to show whether these motifs are conserved among the phylum Microsporidia. Results: To identify such signals, 5' and 3' RACE-PCR experiments were performed on different E. cuniculi mRNAs. This analysis confirmed that transcription overrun occurs in E. cuniculi and may result from stochastic recognition of the AAUAAA polyadenylation signal. Such experiments also showed highly reduced 5'UTRs (<7 nt). Most of the E. cuniculi genes presented a CCC-like motif immediately upstream from the coding start. To characterize other signals involved in differential transcriptional regulation, we then focused our attention on the gene family coding for ribosomal proteins. An AAATTT-like signal was identified upstream from the CCC-like motif. In rare cases the cytosine triplet was shown to be substituted by a GGG-like motif. Comparative genomic studies confirmed that these different signals are also located upstream from genes encoding ribosomal proteins in other microsporidian species including Antonospora locustae, Enterocytozoon bieneusi, Anncaliia algerae (syn. Brachiola algerae) and Nosema ceranae. Based on these results, a systematic analysis of the ~2000 E. cuniculi coding DNA sequences was then performed and revealed that 364 translation initiation codons (18.29% of total CDSs) had been incorrectly predicted. Conclusion: We identified various signals involved in the maturation of E. cuniculi mRNAs. The presence of such signals in phylogenetically distant microsporidian species suggests that a common regulatory mechanism exists among the microsporidia. Furthermore, since 5'UTRs are strongly reduced, these signals can be used to ensure the accurate prediction of translation initiation codons for microsporidian genes and to improve microsporidian genome annotation.
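
The motif arrangement reported in this abstract (an AAATTT-like signal lying upstream of a CCC-like motif, which itself sits immediately before the coding start) can be illustrated with a minimal sketch. The function name, the 30-nt search window, the exact-match patterns and the toy sequence are all assumptions made for illustration; this is not the authors' procedure, which relied on RACE-PCR experiments and comparative genomics.

```python
import re


def find_upstream_signals(genome: str, start: int, window: int = 30) -> dict:
    """Report CCC and AAATTT matches in the `window` nt before a start codon.

    `start` is the 0-based index of the first base of the start codon.
    Exact-match patterns are a simplifying assumption: the abstract only
    speaks of "CCC-like" and "AAATTT-like" signals.
    """
    upstream = genome[max(0, start - window):start].upper()
    offset = len(upstream)
    signals = {}
    for name, pattern in (("CCC", r"CCC"), ("AAATTT", r"AAATTT")):
        # Positions are relative to the start codon (negative = upstream).
        signals[name] = [m.start() - offset for m in re.finditer(pattern, upstream)]
    return signals


# Hypothetical toy sequence: AAATTT signal, then CCC right before ATG.
toy = "GGAAATTTGTACCCATGAAATAA"
hits = find_upstream_signals(toy, toy.index("ATG"))
print(hits)
```

A candidate start codon with a CCC match at position -3 and an AAATTT match further upstream would fit the pattern the abstract describes; a real annotation check would need tolerant motif models rather than exact strings.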

    Detecting variants with Metabolic Design, a new software tool to design probes for explorative functional DNA microarray development

    Background: Microorganisms display vast diversity, and each one has its own set of genes, cell components and metabolic reactions. To assess their huge unexploited metabolic potential in different ecosystems, we need high-throughput tools, such as functional microarrays, that allow the simultaneous analysis of thousands of genes. However, most classical functional microarrays use specific probes that monitor only known sequences, and so fail to cover the full microbial gene diversity present in complex environments. We have thus developed an algorithm, implemented in the user-friendly program Metabolic Design, to design efficient explorative probes. Results: First, we validated our approach by studying eight enzymes involved in the degradation of polycyclic aromatic hydrocarbons from the model strain Sphingomonas paucimobilis sp. EPA505 using a designed microarray of 8,048 probes. As expected, microarray assays identified the targeted set of genes induced during biodegradation kinetics experiments with various pollutants. We then confirmed the identity of these new genes by sequencing, and corroborated the quantitative discrimination of our microarray by quantitative real-time PCR. Finally, we assessed metabolic capacities of microbial communities in soil contaminated with aromatic hydrocarbons. Results show that our probe design (sensitivity and explorative quality) can be used to study a complex environment efficiently. Conclusions: We successfully used our microarray to detect the expression of genes encoding enzymes involved in polycyclic aromatic hydrocarbon degradation for the model strain. In addition, DNA microarray experiments performed on soil polluted by organic pollutants without prior sequence assumptions demonstrate high specificity and sensitivity for gene detection. Metabolic Design is thus a powerful, efficient tool that can be used to design explorative probes and monitor metabolic pathways in complex environments, and it may also be used to study any group of genes. The Metabolic Design software is freely available from the authors and can be downloaded and modified under general public license.

    ASaiM: A Galaxy-based framework to analyze microbiota data

    Background: New generations of sequencing platforms coupled to numerous bioinformatics tools have led to rapid technological progress in metagenomics and metatranscriptomics to investigate complex microorganism communities. Nevertheless, a combination of different bioinformatic tools remains necessary to draw conclusions out of microbiota studies. Modular and user-friendly tools would greatly improve such studies. Findings: We therefore developed ASaiM, an Open-Source Galaxy-based framework dedicated to microbiota data analyses. ASaiM provides an extensive collection of tools to assemble, extract, explore, and visualize microbiota information from raw metataxonomic, metagenomic, or metatranscriptomic sequences. To guide the analyses, several customizable workflows are included and are supported by tutorials and Galaxy interactive tours, which guide users through the analyses step by step. ASaiM is implemented as a Galaxy Docker flavour. It is scalable to thousands of datasets but can also be used on a normal PC. The associated source code is available under Apache 2 license at https://github.com/ASaiM/framework and documentation can be found online (http://asaim.readthedocs.io). Conclusions: Based on the Galaxy framework, ASaiM offers a sophisticated environment with a variety of tools, workflows, documentation, and training to scientists working on complex microorganism communities. It makes analysis and exploration of microbiota data easy, quick, transparent, reproducible, and shareable.

    Comparative genomics highlights the unique biology of Methanomassiliicoccales, a Thermoplasmatales-related seventh order of methanogenic archaea that encodes pyrrolysine

    BACKGROUND: A seventh order of methanogens, the Methanomassiliicoccales, has been identified in diverse anaerobic environments including the gastrointestinal tracts (GIT) of humans and other animals and may contribute significantly to methane emission and global warming. Methanomassiliicoccales are phylogenetically distant from all other orders of methanogens and belong to a large evolutionary branch composed of lineages of non-methanogenic archaea such as Thermoplasmatales, the Deep Hydrothermal Vent Euryarchaeota-2 (DHVE-2, Aciduliprofundum boonei) and the Marine Group-II (MG-II). To better understand this new order and its relationship to other archaea, we manually curated and extensively compared the genome sequences of three Methanomassiliicoccales representatives derived from human GIT microbiota, “Candidatus Methanomethylophilus alvus”, “Candidatus Methanomassiliicoccus intestinalis” and Methanomassiliicoccus luminyensis. RESULTS: Comparative analyses revealed atypical features, such as the scattering of the ribosomal RNA genes in the genome and the absence of the eukaryotic-like histone gene otherwise present in most Euryarchaeota genomes. Previously identified in Thermoplasmatales genomes, these features are now extended to several completely sequenced genomes of this large evolutionary branch, including MG-II and DHVE-2. The three Methanomassiliicoccales genomes share a unique composition of genes involved in energy conservation, suggesting an original combination of two main energy conservation processes previously described in other methanogens. They also display substantial differences from each other, such as their codon usage, the nature and origin of their CRISPR systems and the genes possibly involved in particular environmental adaptations. The genome of M. luminyensis encodes several features to thrive in soil and sediment conditions, suggesting an environmental distribution wider than the GIT. Conversely, “Ca. M. alvus” and “Ca. M. intestinalis” do not present these features and could be more restricted and specialized to the GIT. Prediction of the amber codon usage, either as a termination signal of translation or coding for pyrrolysine, revealed contrasting patterns among the three genomes and suggests a different handling of the Pyl-encoding capacity. CONCLUSIONS: This study represents the first insights into the genomic organization and metabolic traits of the seventh order of methanogens. It suggests contrasting evolutionary histories among the three analyzed Methanomassiliicoccales representatives and provides information on conserved characteristics among the overall methanogens and among Thermoplasmat

    PhylArray: phylogenetic probe design algorithm for microarray

    MOTIVATION: Microbial diversity is still largely unknown in most environments, such as soils. In order to get access to this microbial 'black-box', the development of powerful tools such as microarrays is necessary. However, the reliability of this approach relies on probe efficiency, in particular sensitivity, specificity and explorative power, in order to obtain an image of the microbial communities that is close to reality. RESULTS: We propose a new probe design algorithm that is able to select microarray probes targeting SSU rRNA at any phylogenetic level. This original approach, implemented in a program called 'PhylArray', designs a combination of degenerate and non-degenerate probes for each target taxon. Comparative experimental evaluations indicate that probes designed with PhylArray yield a higher sensitivity and specificity than those designed by conventional approaches. Applying the combined PhylArray/GoArrays strategy helps to optimize the hybridization performance of short probes. Finally, hybridizations with environmental targets have shown that the use of the PhylArray strategy can draw attention to even previously unknown bacteria.
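
The idea of a degenerate probe mentioned in this abstract can be made concrete with a small sketch. It assumes a gap-free alignment of equal-length target regions and collapses each column into a standard IUPAC ambiguity code. The function names and the toy sequences are hypothetical, and this is only an illustration of the general concept, not the PhylArray algorithm itself.

```python
# IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_probe(aligned: list[str]) -> str:
    """Collapse equal-length, gap-free sequences into one degenerate probe."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    columns = zip(*(seq.upper() for seq in aligned))
    return "".join(IUPAC[frozenset(col)] for col in columns)


def degeneracy(probe: str) -> int:
    """Number of distinct non-degenerate sequences the probe stands for."""
    sizes = {code: len(bases) for bases, code in IUPAC.items()}
    total = 1
    for code in probe:
        total *= sizes[code]
    return total


# Hypothetical aligned target regions from one taxon.
targets = ["ACGTAC", "ACGTGC", "ATGTAC"]
probe = degenerate_probe(targets)
print(probe, degeneracy(probe))
```

Note that the degeneracy (4 here) can exceed the number of input sequences (3), because the probe also covers combinations not seen in the alignment; that is one way a degenerate probe could match variants that have not yet been sequenced.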

    Functional Characteristics of a Highly Specific Integrase Encoded by an LTR-Retrotransposon

    Background: The retroviral Integrase protein catalyzes the insertion of linear viral DNA into host cell DNA. Although different retroviruses have been shown to target distinctive chromosomal regions, few of them display site-specific integration. ZAM, a retroelement from Drosophila melanogaster very similar in structure and replication cycle to mammalian retroviruses, is highly site-specific. Indeed, ZAM copies target the genomic 5'-CGCGCg-3' consensus sequences. To elucidate the determinants of this high integration specificity, we investigated the functional properties of its integrase protein, denoted ZAM-IN. Principal Findings: Here we show that ZAM-IN is able to nick DNA molecules in vitro. This endonuclease activity targets specific sequences that are present in a 388 bp fragment taken from the white locus and known to be a genomic ZAM integration site in vivo. Furthermore, ZAM-IN displays the unusual property of directly binding specific genomic DNA sequences. Two specific and independent sites are recognized within the 388 bp fragment of the white locus: the CGCGCg sequence and a closely apposed site different in sequence. Conclusion: This study strongly argues that the intrinsic properties of ZAM-IN, i.e. its binding properties and its endonuclease activity, play an important part in ZAM integration specificity. Its ability to select two binding sites and to nick the DNA molecule is reminiscent of the strategy used by some site-specific recombination enzymes and forms the basis for site-specifi

    GeneFarm, structural and functional annotation of Arabidopsis gene and protein families by a network of experts

    Genomic projects depend heavily on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that improved annotation will allow better data mining of genomes, and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach since many features obtained for one gene can be extrapolated to some or all the other genes of a family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Pro