Statistical Analysis of Microarray Data with Replicated Spots: A Case Study with Synechococcus WH8102
Until recently, microarray experiments often involved relatively few arrays with only a single representation of each gene on each array. A complete genome microarray with multiple spots per gene (spread out spatially across the array) was developed in order to compare the gene expression of a marine cyanobacterium and a knockout mutant strain in a defined artificial seawater medium. Statistical methods were developed for analysis in the special situation of this case study, where there is gene replication within an array and where relatively few arrays are used, which can be the case with current array technology. Due in part to the replication within an array, it was possible to detect very small changes in the levels of expression between the wild type and mutant strains. One interesting biological outcome of this experiment is the indication of the extent to which the phosphorus regulatory system of this cyanobacterium affects the expression of multiple genes beyond those strictly involved in phosphorus acquisition.
NCBI GEO: archive for high-throughput functional genomic data
The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest public repository for high-throughput gene expression data. Additionally, GEO hosts other categories of high-throughput functional genomic data, including those that examine genome copy number variations, chromatin structure, methylation status and transcription factor binding. These data are generated by the research community using high-throughput technologies like microarrays and, more recently, next-generation sequencing. The database has a flexible infrastructure that can capture fully annotated raw and processed data, enabling compliance with major community-derived scientific reporting standards such as "Minimum Information About a Microarray Experiment" (MIAME). In addition to serving as a centralized data storage hub, GEO offers many tools and features that allow users to effectively explore, analyze and download expression data from both gene-centric and experiment-centric perspectives. This article summarizes the GEO repository structure, content and operating procedures, as well as recently introduced data mining features. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/
A contiguous de novo genome assembly of sugar beet EL10 (Beta vulgaris L.)
A contiguous assembly of the inbred 'EL10' sugar beet (Beta vulgaris ssp. vulgaris) genome was constructed using PacBio long-read sequencing, BioNano optical mapping, Hi-C scaffolding, and Illumina short-read error correction. The EL10.1 assembly was 540 Mb, of which 96.2% was contained in nine chromosome-sized pseudomolecules with lengths from 52 to 65 Mb, and 31 contigs with a median size of 282 kb that remained unassembled. Gene annotation incorporating RNA-seq data and curated sequences via the MAKER annotation pipeline generated 24,255 gene models. Results indicated that the EL10.1 genome assembly is a contiguous genome assembly highly congruent with the published sugar beet reference genome. Gross duplicate gene analyses of EL10.1 revealed little large-scale intra-genome duplication. Reduced gene copy number for well-annotated gene families relative to other core eudicots was observed, especially for transcription factors. Variation in genome size in B. vulgaris was investigated by flow cytometry among 50 individuals producing estimates from 633 to 875 Mb/1C. Read-depth mapping with short-read whole-genome sequences from other sugar beet germplasm suggested that relatively few regions of the sugar beet genome appeared associated with high-copy number variation.
NCBI GEO: archive for functional genomics data sets - 10 years on
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20,000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/
Interactive metagenomic visualization in a Web browser
Background: A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Results: Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex hierarchies of metagenomic classifications. Krona combines a variant of radial, space-filling displays with parametric coloring and interactive polar-coordinate zooming. The HTML5 and JavaScript implementation enables fully interactive charts that can be explored with any modern Web browser, without the need for installed software or plug-ins. This Web-based architecture also allows each chart to be an independent document, making them easy to share via e-mail or post to a standard Web server. To illustrate Krona's utility, we describe its application to various metagenomic data sets and its compatibility with popular metagenomic analysis tools. Conclusions: Krona is both a powerful metagenomic visualization tool and a demonstration of the potential of HTML5 for highly accessible bioinformatic visualizations. Its rich and interactive displays facilitate more informed interpretations of metagenomic analyses, while its implementation as a browser-based application makes it extremely portable and easily adopted into existing analysis packages. Both the Krona rendering code and conversion tools are freely available under a BSD open-source license, and available from: http://krona.sourceforge.net
Genetic variation for tuber mineral concentrations in accessions of the Commonwealth Potato Collection
The variation in tuber mineral concentrations amongst accessions of wild tuber-bearing Solanum species in the Commonwealth Potato Collection (CPC) was evaluated under greenhouse conditions. Selected CPC accessions, representing the eco-geographical distribution of wild potatoes, were grown to maturity in peat-based compost under controlled conditions. Tubers from five plants of each accession were harvested, bulked and their mineral composition analysed. Among the germplasm investigated, there was a greater range in tuber concentrations of some elements of nutritional significance to both plants and animals, such as Ca, Fe and Zn (6.7-, 3.6- and 4.5-fold, respectively), than of others, such as K, P and S (all <3-fold). Significant positive correlations were found between mean altitude of the species' range and tuber P, K, Cu and Mg concentrations. The amount of diversity observed in the CPC indicates the existence of wide differences in tuber mineral accumulation among different potato accessions. This might be useful in breeding for nutritional improvement of potato tubers.
The Complete Genome Sequence of Escherichia coli EC958: A High Quality Reference Sequence for the Globally Disseminated Multidrug Resistant E. coli O25b:H4-ST131 Clone
Escherichia coli ST131 is now recognised as a leading contributor to urinary tract and bloodstream infections in both community and clinical settings. Here we present the complete, annotated genome of E. coli EC958, which was isolated from the urine of a patient presenting with a urinary tract infection in the Northwest region of England and represents the best-characterised ST131 strain. Sequencing was carried out using the Pacific Biosciences platform, which provided sufficient depth and read-length to produce a complete genome without the need for other technologies. The discovery of spurious contigs within the assembly that correspond to site-specific inversions in the tail fibre regions of prophages demonstrates the potential for this technology to reveal dynamic evolutionary mechanisms. E. coli EC958 belongs to the major subgroup of ST131 strains that produce the CTX-M-15 extended spectrum β-lactamase, are fluoroquinolone resistant and encode the fimH30 type 1 fimbrial adhesin. This subgroup includes the Indian strain NA114 and the North American strain JJ1886. A comparison of the genomes of EC958, JJ1886 and NA114 revealed that differences in the arrangement of genomic islands, prophages and other repetitive elements in the NA114 genome are not biologically relevant and are due to misassembly. The availability of a high quality uropathogenic E. coli ST131 genome provides a reference for understanding this multidrug resistant pathogen and will facilitate novel functional, comparative and clinical studies of the E. coli ST131 clonal lineage.
Draft Genome Sequences from a Novel Clade of Bacillus cereus Sensu Lato Strains, Isolated from the International Space Station
The draft genome sequences of six Bacillus strains, isolated from the International Space Station and belonging to the Bacillus anthracis-B. cereus-B. thuringiensis group, are presented here. These strains were isolated from the Japanese Experiment Module (one strain), U.S. Harmony Node 2 (three strains), and Russian Segment Zvezda Module (two strains).
Comparative genomics of the bacterial genus Listeria: Genome evolution is characterized by limited gene acquisition and limited gene loss
Background: The bacterial genus Listeria contains pathogenic and non-pathogenic species, including the pathogens L. monocytogenes and L. ivanovii, both of which carry homologous virulence gene clusters such as the prfA cluster and clusters of internalin genes. Initial evidence for multiple deletions of the prfA cluster during the evolution of Listeria indicates that this genus provides an interesting model for studying the evolution of virulence and also presents practical challenges with regard to definition of pathogenic strains. Results: To better understand genome evolution and evolution of virulence characteristics in Listeria, we used a next generation sequencing approach to generate draft genomes for seven strains representing Listeria species or clades for which genome sequences were not available. Comparative analyses of these draft genomes and six publicly available genomes, which together represent the main Listeria species, showed evidence for (i) a pangenome with 2,032 core and 2,918 accessory genes identified to date, (ii) a critical role of gene loss events in transition of Listeria species from facultative pathogen to saprotroph, even though a consistent pattern of gene loss seemed to be absent, and a number of isolates representing non-pathogenic species still carried some virulence associated genes, and (iii) divergence of modern pathogenic and non-pathogenic Listeria species and strains, most likely circa 47 million years ago, from a pathogenic common ancestor that contained key virulence genes. Conclusions: Genome evolution in Listeria involved limited gene loss and acquisition as supported by (i) a relatively high coverage of the predicted pan-genome by the observed pan-genome, (ii) conserved genome size (between 2.8 and 3.2 Mb), and (iii) a highly syntenic genome. Limited gene loss in Listeria did include loss of virulence associated genes, likely associated with multiple transitions to a saprotrophic lifestyle. The genus Listeria thus provides an example of a group of bacteria that appears to evolve through a loss of virulence rather than acquisition of virulence characteristics. While Listeria includes a number of species-like clades, many of these putative species include clades or strains with atypical virulence associated characteristics. This information will allow for the development of genetic and genomic criteria for pathogenic strains, including development of assays that specifically detect pathogenic Listeria strains.