108 research outputs found

    CompaGB: An open framework for genome browsers comparison

    Background: Tools to visualize and explore genomes hold a central place in genomics, and the diversity of genome browsers has increased dramatically over the last few years. Comparing and choosing a well-adapted genome browser often turns out to be a daunting task, as it requires multidisciplinary knowledge and the number of tools, functionalities and features is overwhelming. Findings: To assist in this task, we propose a community-based framework based on two cornerstones: (i) the implementation of an industry-promoted software qualification method (QSOS) adapted for genome browser evaluations, and (ii) a web resource providing numerous facilities either for visualizing comparisons or performing new evaluations. We formulated 60 criteria specifically for genome browsers, and incorporated another 65 directly from QSOS's generic section. Those criteria aim to answer versatile needs, ranging from a biologist whose interest lies primarily in user-friendly and informative functionalities, to a bioinformatician who wants to integrate the genome browser into a wider framework, to a computer scientist who might choose software according to more technical features. We developed a dedicated web application to enrich the existing QSOS functionalities (weighting of criteria, user profile) with features of interest to a community-based framework: easy management of evolving data, user comments... Conclusions: The framework is available at http://genome.jouy.inra.fr/CompaGB. It is open to anyone who wishes to participate in the evaluations. It helps the scientific community to (1) choose a genome browser that would better fit their particular project, (2) visualize features comparatively with easily accessible formats, such as tables or radar plots, and (3) perform their own evaluation against the defined criteria. To illustrate the CompaGB functionalities, we have evaluated seven genome browsers according to the implemented methodology. A summary of the features of the compared genome browsers is presented and discussed.

    Identification of DNA Motifs Implicated in Maintenance of Bacterial Core Genomes by Predictive Modeling

    Bacterial biodiversity at the species level, in terms of gene acquisition or loss, is so immense that it raises the question of how essential chromosomal regions are spared from uncontrolled rearrangements. Protection of the genome likely depends on specific DNA motifs that impose limits on the regions that undergo recombination. Although most such motifs remain unidentified, they are theoretically predictable based on their genomic distribution properties. We examined the distribution of the “crossover hotspot instigator,” or Chi, in Escherichia coli, and found that its exceptional distribution is restricted to the core genome common to three strains. We then formulated a set of criteria that were incorporated in a statistical model to search core genomes for motifs potentially involved in genome stability in other species. Our strategy led us to identify and biologically validate two distinct heptamers that possess Chi properties, one in Staphylococcus aureus, and the other in several streptococci. This strategy paves the way for wide-scale discovery of other important functional noncoding motifs that distinguish core genomes from the strain-variable regions.

    Amplification biases: possible differences among deviating gene expressions.

    BACKGROUND: Gene expression profiling has become a tool of choice to study pathological or developmental questions, but in most cases the material is scarce and requires sample amplification. Two main procedures have been used: in vitro transcription (IVT) and polymerase chain reaction (PCR), the former known as linear and the latter as exponential. Previous reports identified enzymatic pitfalls in PCR and IVT protocols; however, the possible differences between the sequences affected by these amplification defaults were only rarely explored. RESULTS: Screening a bovine cDNA array dedicated to embryonic stages with embryonic (n = 3) and somatic tissues (n = 2), we proceeded to moderate amplifications starting from 1 µg of total RNA (global PCR or IVT one round). Whatever the tissue, 16% of the probes were involved in deviating gene expressions due to amplification defaults. These distortions were likely due to the molecular features of the affected sequences (position within a gene, GC content, hairpin number) but also to the relative abundance of these transcripts within the tissues. These deviating genes mainly encoded housekeeping genes from physiological or cellular processes (70%) and constituted 2 subsets which did not overlap (molecular features, signal intensities, gene ID). However, the differential expressions identified between embryonic stages were both reliable (minor intersect with biased expressions) and relevant (biologically validated). In addition, the relative expression levels of those genes were biologically similar between amplified and unamplified samples. CONCLUSION: Contrary to the most recent reports, which challenged the use of intense amplification procedures on minute amounts of RNA, we chose moderate PCR and IVT amplifications for our gene profiling study. Conclusively, it appeared that systematic biases arose even with moderate amplification procedures, independently of (i) the sample used: brain, ovary or embryos, (ii) the enzymatic properties initially inferred (exponential or linear) and (iii) the preliminary optimization of the protocols. Moreover, the use of an in-house developed array, small-sized but well suited to the tissues we worked with, was of real interest for the search of differential expressions.

    Expressed sequences tags of the anther smut fungus, Microbotryum violaceum, identify mating and pathogenicity genes

    Background: The basidiomycete fungus Microbotryum violaceum is responsible for the anther-smut disease in many plants of the Caryophyllaceae family and is a model in genetics and evolutionary biology. Infection is initiated by dikaryotic hyphae produced after the conjugation of two haploid sporidia of opposite mating type. This study describes M. violaceum ESTs corresponding to nuclear genes expressed during conjugation and early hyphal production. Results: A normalized cDNA library generated 24,128 sequences, which were assembled into 7,765 unique genes; 25.2% of them displayed significant similarity to annotated proteins from other organisms, 74.3% a weak similarity to the same set of known proteins, and 0.5% were orphans. We identified putative pheromone receptors and genes that in other fungi are involved in the mating process. We also identified many sequences similar to genes known to be involved in pathogenicity in other fungi. The M. violaceum EST database, MICROBASE, is available on the Web and provides access to the sequences, assembled contigs, annotations and programs to compare similarities against MICROBASE. Conclusion: This study provides a basis for cloning the mating type locus, for further investigation of pathogenicity genes in the anther smut fungi, and for comparative genomics.

    Deciphering the Molecular Basis of Wine Yeast Fermentation Traits Using a Combined Genetic and Genomic Approach

    The genetic basis of the phenotypic diversity of yeast is still poorly understood. Wine yeast strains have specific abilities to grow and ferment under stressful conditions compared with other strains, but the genetic basis underlying these traits is unknown. Understanding how sequence variation influences such phenotypes is a major challenge to address adaptation mechanisms of wine yeast. We aimed to identify the genetic basis of fermentation traits and gain insight into their relationships with variations in gene expression among yeast strains. We combined fermentation trait QTL mapping and expression profiling of fermenting cells in a segregating population from a cross between a wine yeast derivative and a laboratory strain. We report the identification of QTL for various fermentation traits (fermentation rates, nitrogen utilization, metabolites production) as well as expression QTL (eQTL). We found that many transcripts mapped to several eQTL hotspots and that two of them overlapped with QTL for fermentation traits. A QTL controlling the maximal fermentation rate and nitrogen utilization overlapping with an eQTL hotspot was dissected. We functionally demonstrated that an allele of the ABZ1 gene, localized in the hotspot and involved in p-aminobenzoate biosynthesis, controls the fermentation rate through modulation of nitrogen utilization. Our data suggest that the laboratory strain harbors a defective ABZ1 allele, which triggers strong metabolic and physiological alterations responsible for the generation of the eQTL hotspot. They also suggest that a number of gene expression differences result from some alleles that trigger major physiological disturbances.

    MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level

    BACKGROUND: The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. DESCRIPTION: Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. CONCLUSION: The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic

    A New Integrative and Mobilizable Element Is a Major Contributor to Tetracycline Resistance in Streptococcus dysgalactiae subsp. equisimilis

    Tetracycline resistance in streptococci is mainly due to ribosomal protection mediated by the tet(M) gene that is usually located in the integrative and conjugative elements (ICEs) of the Tn916-family. In this study, we analyzed the genes involved in tetracycline resistance and the associated mobile genetic elements (MGEs) in Streptococcus dysgalactiae subsp. equisimilis (SDSE) causing invasive disease. SDSE resistant to tetracycline collected from 2012 to 2019 in a single hospital and from 2018 in three other hospitals were analyzed by whole genome sequencing. Out of a total of 84 SDSE isolates, 24 (28.5%) were resistant to tetracycline due to the presence of tet(M) (n = 22), tet(W) (n = 1), or tet(L) plus tet(W) (n = 1). The tet(M) genes were found in the ICEs of the Tn916-family (n = 10) and in a new integrative and mobilizable element (IME; n = 12). Phylogenetic analysis showed a higher genetic diversity among the strains carrying Tn916 than those having the new IME, which were closely related, and all belonged to CC15. In conclusion, tetracycline resistance in SDSE is mostly due to the tet(M) gene associated with ICEs belonging to the Tn916-family and a new IME. This new IME is a major cause of tetracycline resistance in invasive Streptococcus dysgalactiae subsp. equisimilis in our settings.

    Organised Genome Dynamics in the Escherichia coli Species Results in Highly Diverse Adaptive Paths

    The Escherichia coli species represents one of the best-studied model organisms, but also encompasses a variety of commensal and pathogenic strains that diversify by high rates of genetic change. We uniformly (re-) annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the closest E. coli related species), including seven that we sequenced to completion. Within the ~18,000 families of orthologous genes, we found ~2,000 common to all strains. Although recombination rates are much higher than mutation rates, we show, both theoretically and using phylogenetic inference, that this does not obscure the phylogenetic signal, which places the B2 phylogenetic group and one group D strain at the basal position. Based on this phylogeny, we inferred past evolutionary events of gain and loss of genes, identifying functional classes under opposite selection pressures. We found an important adaptive role for metabolism diversification within group B2 and Shigella strains, but identified few or no extraintestinal virulence-specific genes, which could render difficult the development of a vaccine against extraintestinal infections. Genome flux in E. coli is confined to a small number of conserved positions in the chromosome, which most often are not associated with integrases or tRNA genes. Core genes flanking some of these regions show higher rates of recombination, suggesting that a gene, once acquired by a strain, spreads within the species by homologous recombination at the flanking genes. Finally, the genome's long-scale structure of recombination indicates lower recombination rates, but not higher mutation rates, at the terminus of replication. The ensuing effect of background selection and biased gene conversion may thus explain why this region is A+T-rich and shows high sequence divergence but low sequence polymorphism. Overall, despite a very high gene flow, genes co-exist in an organised genome.