Search CORE

19 research outputs found

Array2BIO: from microarray expression data to functional annotation of co-regulated genes

Author: Chain Patrick SG
Garcia Emilio
Loots Gabriela G
Mabery Shalini
Ovcharenko Ivan
Rasley Amy
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: There are several isolated tools for partial analysis of microarray expression data. To provide an integrative, easy-to-use and automated toolkit for the analysis of Affymetrix microarray expression data we have developed Array2BIO, an application that couples several analytical methods into a single web-based utility. RESULTS: Array2BIO converts raw intensities into probe expression values, automatically maps those to genes, and subsequently identifies groups of co-expressed genes using two complementary approaches: (1) comparative analysis of signal versus control and (2) clustering analysis of gene expression across different conditions. The identified genes are assigned to functional categories based on Gene Ontology classification and KEGG protein interaction pathways. Array2BIO reliably handles low-expressor genes and provides a set of statistical methods for quantifying expression levels, including Benjamini-Hochberg and Bonferroni multiple testing corrections. An automated interface with the ECR Browser provides evolutionary conservation analysis for the identified gene loci while the interconnection with Crème allows prediction of gene regulatory elements that underlie observed expression patterns. CONCLUSION: We have developed Array2BIO – a web-based tool for rapid comprehensive analysis of Affymetrix microarray expression data, which also allows users to link expression data to Dcode.org comparative genomics tools and integrates a system for translating co-expression data into mechanisms of gene co-regulation. Array2BIO is publicly available a

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central
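
The Array2BIO abstract above mentions Benjamini-Hochberg and Bonferroni multiple testing corrections. As a rough illustration of what these two corrections compute, here is a minimal sketch in plain Python. It is not Array2BIO's code; the function names `bonferroni` and `benjamini_hochberg` and the sample p-values are assumptions made for this example only.

```python
def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-gene p-values, for illustration only.
    pvals = [0.01, 0.04, 0.03, 0.005]
    print("Bonferroni:", bonferroni(pvals))
    print("Benjamini-Hochberg:", benjamini_hochberg(pvals))
```

Bonferroni is the more conservative of the two. Benjamini-Hochberg typically retains more genes at the same threshold because it controls the expected fraction of false discoveries rather than the chance of any false positive.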

Artificial Polyploidy Improves Bacterial Single Cell Genome Recovery

Author: A Raghunathan
Armand E. K. Dichosa
C Spits
Chien-Chi Lo
Cliff S. Han
DH Buckley
DJ Haydon
DJ Sherratt
DN Margalit
DR Zerbino
FB Dean
GW Tyson
H Ito
H Li
HP Erickson
J Fox
J Wang
J. Chris Detter
Jeremy P. Snook
Kim McMurry
KV Zhang
L Amaral
Lance D. Green
Lara G. Preteska
Lea L. Weston
M Podar
MH Garcia
Michael S. Fitzsimons
P Domadia
Patrick S. Chain
Paul Jaak Janssen
PG Eckburg
Q Huang
R Jaiswal
R Rozen
R Stepanauskas
RI Amman
RS Lasken
S Hosono
S Kurtz
S Rodrigue
S Urgaonkar
SG Tringe
T Kaeberlein
T Läppchen
T Woyke
TK Beuria
W Margolin
Wei Gu
X Pan
Xiaojing Zhang
Y Marcy
Y Ohashi
Publication venue: Public Library of Science
Publication date: 22/05/2012

BACKGROUND: Single cell genomics (SCG) is a combination of methods whose goal is to decipher the complete genomic sequence from a single cell and has been applied mostly to organisms with smaller genomes, such as bacteria and archaea. Prior single cell studies showed that a significant portion of a genome could be obtained. However, breakages of genomic DNA and amplification bias have made it very challenging to acquire a complete genome with single cells. We investigated an artificial method to induce polyploidy in Bacillus subtilis ATCC 6633 by blocking cell division and have shown that we can significantly improve the performance of genomic sequencing from a single cell. METHODOLOGY/PRINCIPAL FINDINGS: We inhibited the bacterial cytoskeleton protein FtsZ in B. subtilis with an FtsZ-inhibiting compound, PC190723, resulting in larger undivided single cells with multiple copies of their genome. qPCR assays of these larger, sorted cells showed higher DNA content, less amplification bias, and greater genomic recovery than untreated cells. SIGNIFICANCE: The method presented here shows the potential to obtain a nearly complete genome sequence from a single bacterial cell. With millions of uncultured bacterial species in nature, this method holds tremendous promise to provide insight into the genomic novelty of yet-to-be discovered species, and given the temporary effects of artificial polyploidy coupled with the ability to sort and distinguish differences in cell size and genomic DNA content, may allow recovery of specific organisms in addition to their genomes

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Identification of mobile genetic elements with geNomad

Author: Babinski Michal
Camargo Antonio Pedro
Chain Patrick SG
Hu Bin
Kyrpides Nikos C
Nayfach Stephen
Roux Simon
Schulz Frederik
Xu Yan
Publication venue: eScholarship, University of California
Publication date: 21/09/2023

Identifying and characterizing mobile genetic elements in sequencing data is essential for understanding their diversity, ecology, biotechnological applications and impact on public health. Here we introduce geNomad, a classification and annotation framework that combines information from gene content and a deep neural network to identify sequences of plasmids and viruses. geNomad uses a dataset of more than 200,000 marker protein profiles to provide functional gene annotation and taxonomic assignment of viral genomes. Using a conditional random field model, geNomad also detects proviruses integrated into host genomes with high precision. In benchmarks, geNomad achieved high classification performance for diverse plasmids and viruses (Matthews correlation coefficient of 77.8% and 95.3%, respectively), substantially outperforming other tools. Leveraging geNomad's speed and scalability, we processed over 2.7 trillion base pairs of sequencing data, leading to the discovery of millions of viruses and plasmids that are available through the IMG/VR and IMG/PR databases. geNomad is available at https://portal.nersc.gov/genomad

eScholarship - University of California
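
The geNomad abstract above reports classification performance as a Matthews correlation coefficient (MCC). For readers unfamiliar with the metric, the following is a minimal sketch of how MCC is computed from a binary confusion matrix. The function name `matthews_corrcoef` and the counts are assumptions for illustration; they are not taken from the geNomad benchmarks.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns a value in [-1, 1]; by a common convention, 0.0 is returned
    when any marginal total is zero and the ratio is undefined.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts: 90 true positives, 85 true negatives,
    # 15 false positives, 10 false negatives.
    print(round(matthews_corrcoef(90, 85, 15, 10), 4))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when one class (for example, plasmids) is much rarer than the other.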

Discovery of an Antarctic Ascidian-Associated Uncultivated Verrucomicrobia with Antimelanoma Palmerolide Biosynthetic Potential

Author: Avalon Nicole E
Baker Bill J
Chain Patrick SG
Daligault Hajnalka E
Davenport Karen W
Dichosa Armand EK
Higham Mary L
Kunde Yuliya
Lo Chien-Chi
Murray Alison E
Read Robert W
Publication venue: eScholarship, University of California
Publication date: 22/12/2021

The Antarctic marine ecosystem harbors a wealth of biological and chemical innovation that has risen in concert over millennia since the isolation of the continent and formation of the Antarctic circumpolar current. Scientific inquiry into the novelty of marine natural products produced by Antarctic benthic invertebrates led to the discovery of a bioactive macrolide, palmerolide A, that has specific activity against melanoma and holds considerable promise as an anticancer therapeutic. While this compound was isolated from the Antarctic ascidian Synoicum adareanum, its biosynthesis has since been hypothesized to be microbially mediated, given structural similarities to microbially produced hybrid nonribosomal peptide-polyketide macrolides. Here, we describe a metagenome-enabled investigation aimed at identifying the biosynthetic gene cluster (BGC) and palmerolide A-producing organism. A 74-kbp candidate BGC encoding the multimodular enzymatic machinery (hybrid type I-trans-AT polyketide synthase-nonribosomal peptide synthetase and tailoring functional domains) was identified and found to harbor key features predicted as necessary for palmerolide A biosynthesis. Surveys of ascidian microbiome samples targeting the candidate BGC revealed a high correlation between palmerolide gene targets and a single 16S rRNA gene variant (R = 0.83 to 0.99). Through repeated rounds of metagenome sequencing followed by binning contigs into metagenome-assembled genomes, we were able to retrieve a nearly complete genome (10 contigs) of the BGC-producing organism, a novel verrucomicrobium within the Opitutaceae family that we propose here as "Candidatus Synoicihabitans palmerolidicus." The refined genome assembly harbors five highly similar BGC copies, along with structural and functional features that shed light on the host-associated nature of this unique bacterium. IMPORTANCE Palmerolide A has potential as a chemotherapeutic agent to target melanoma. We interrogated the microbiome of the Antarctic ascidian, Synoicum adareanum, using a cultivation-independent high-throughput sequencing and bioinformatic strategy. The metagenome-encoded biosynthetic machinery predicted to produce palmerolide A was found to be associated with the genome of a member of the S. adareanum core microbiome. Phylogenomic analysis suggests the organism represents a new deeply branching genus, "Candidatus Synoicihabitans palmerolidicus," in the Opitutaceae family of the Verrucomicrobia phylum. The Ca. Synoicihabitans palmerolidicus 4.29-Mb genome encodes a repertoire of carbohydrate-utilizing and transport pathways, a chemotaxis system, flagellar biosynthetic capacity, and other regulatory elements enabling its ascidian-associated lifestyle. The palmerolide producer's genome also contains five distinct copies of the large palmerolide biosynthetic gene cluster that may provide structural complexity of palmerolide variants

eScholarship - University of California

Comparative metagenomics reveals impact of contaminants on groundwater microbiomes

Author: Arkin Adam P
Chain Patrick SG
Deng Ye
Fields Matthew W
Gao Weimin
Hazen Terry C
He Zhili
Hemme Christopher L
Nostrand Joy D Van
Qin Yujia
Rubin Edward M
Shi Zhou
Tiedje James M
Tringe Susannah G
Tu Qichao
Wu Liyou
Zhou Jizhong
Publication venue: eScholarship, University of California
Publication date: 01/01/2015

To understand patterns of geochemical cycling in pristine versus contaminated groundwater ecosystems, pristine shallow groundwater (FW301) and contaminated groundwater (FW106) samples from the Oak Ridge Integrated Field Research Center (OR-IFRC) were sequenced and compared to each other to determine phylogenetic and metabolic differences between the communities. Proteobacteria (e.g., Burkholderia, Pseudomonas) are the most abundant lineages in the pristine community, though a significant proportion (>55%) of the community is composed of poorly characterized low abundance (individually <1%) lineages. The phylogenetic diversity of the pristine community contributed to a broader diversity of metabolic networks than the contaminated community. In addition, the pristine community encodes redundant and mostly complete geochemical cycles distributed over multiple lineages and appears capable of a wide range of metabolic activities. In contrast, many geochemical cycles in the contaminated community appear truncated or minimized due to decreased biodiversity and dominance by Rhodanobacter populations capable of surviving the combination of stresses at the site. These results indicate that the pristine community is more robust and encodes more functional redundancy than the stressed community, which contributes to more efficient nutrient cycling and adaptability

eScholarship - University of California

A Practical Approach to Using the Genomic Standards Consortium MIxS Reporting Standard for Comparative Genomics and Metagenomics

Author: Chain Patrick SG
Eloe-Fadrosh Emiley A
Harris Nomi L
Hu Bin
Hunter Christopher I
Johnson Leah YD
Kelliher Julia M
McCue Lee Ann
McHardy Alice Carolyn
Miller Mark Andrew
Mukherjee Supratim
Mungall Christopher J
Patil Sujay Sanjeev
Reddy TBK
Rodriguez Francisca E
Schriml Lynn M
Smith Montana
Thornton Michael B
Walls Ramona
Publication venue: eScholarship, University of California
Publication date: 01/01/2024

Comparative analysis of (meta)genomes necessitates aggregation, integration, and synthesis of well-annotated data using standards. The Genomic Standards Consortium (GSC) collaborates with the research community to develop and maintain the Minimum Information about any (x) Sequence (MIxS) reporting standard for genomic data. To facilitate the use of the GSC's MIxS reporting standard, we provide a description of the structure and terminology, how to navigate ontologies for required terms in MIxS, and demonstrate practical usage through a soil metagenome example

eScholarship - University of California

Challenges in Bioinformatics Workflows for Processing Microbiome Omics Data at Scale

Author: Anubhav
Babinski Michal
Canon Shane
Chain Patrick SG
Corilo Yuri
Davenport Karen
Duncan William D
Eloe-Fadrosh Emiley A
Fagnan Kjiersten
Flynn Mark
Foster Brian
Hays David
Hu Bin
Huntemann Marcel
Jackson Elais K Player
Kelliher Julia
Li Po-E
Lo Chien-Chi
Mans Douglas
McCue Lee Ann
Mouncey Nigel
Mungall Christopher J
Piehowski Paul D
Purvine Samuel O
Smith Montana
Varghese Neha Jacob
Winston Donald
Xu Yan
Publication venue: eScholarship, University of California
Publication date: 01/01/2021

The nascent field of microbiome science is transitioning from a descriptive approach of cataloging taxa and functions present in an environment to applying multi-omics methods to investigate microbiome dynamics and function. A large number of new tools and algorithms have been designed and used for very specific purposes on samples collected by individual investigators or groups. While these developments have been quite instructive, the ability to compare microbiome data generated by many groups of researchers is impeded by the lack of standardized application of bioinformatics methods. Additionally, there are few examples of broad bioinformatics workflows that can process metagenome, metatranscriptome, metaproteome and metabolomic data at scale, and no central hub that allows processing, or provides varied omics data that are findable, accessible, interoperable and reusable (FAIR). Here, we review some of the challenges that exist in analyzing omics data within the microbiome research sphere, and provide context on how the National Microbiome Data Collaborative has adopted a standardized and open access approach to address such challenges

PubMed Central

eScholarship - University of California

Standardized and accessible multi-omics bioinformatics workflows through the NMDC EDGE resource

Author: Babinski Michal
Canon Shane
Cavanna Eric
Chain Patrick SG
Cholia Shreyas
Clum Alicia
Corilo Yuri E
Eloe-Fadrosh Emiley A
Flynn Mark C
Fujimoto Grant
Giberson Cameron
Hu Bin
Johnson Leah YD
Kelliher Julia M
Li Kaitlyn J
Li Po-E
Li Valerie
Lo Chien-Chi
Lynch Wendi
McCue Lee Ann
Mungall Chris
Piehowski Paul
Prime Kaelan
Purvine Samuel
Rodriguez Francisca
Roux Simon
Sarrafan Setareh
Shakya Migun
Smith Montana
Xu Yan
Publication venue: eScholarship, University of California
Publication date: 01/12/2024

Accessible and easy-to-use standardized bioinformatics workflows are necessary to advance microbiome research from observational studies to large-scale, data-driven approaches. Standardized multi-omics data enables comparative studies, data reuse, and applications of machine learning to model biological processes. To advance broad accessibility of standardized multi-omics bioinformatics workflows, the National Microbiome Data Collaborative (NMDC) has developed the Empowering the Development of Genomics Expertise (NMDC EDGE) resource, a user-friendly, open-source web application (https://nmdc-edge.org). Here, we describe the design and main functionality of the NMDC EDGE resource for processing metagenome, metatranscriptome, natural organic matter, and metaproteome data. The architecture relies on three main layers (web application, orchestration, and execution) to ensure flexibility and expansion to future workflows. The orchestration and execution layers leverage best practices in software containers and accommodate high-performance computing and cloud computing services. Further, we have adopted a robust user research process to collect feedback for continuous improvement of the resource. NMDC EDGE provides an accessible interface for researchers to process multi-omics microbiome data using production-quality workflows to facilitate improved data standardization and interoperability

eScholarship - University of California