Warwick Research Archives Portal Repository

Targeted Discovery of Glycoside Hydrolases from a Switchgrass-Adapted Compost Community

Author: Amitha Reddy
Blake A. Simmons
Francisco Rodriguez-Valera
Jean S. VanderGheynst
Joshua I. Park
Martin Allgaier
Natalia Ivanova
Patrik D'haeseleer
Philip Hugenholtz
Rajat Sapra
Steve Lowry
Terry C. Hazen
Publication venue: Public Library of Science
Publication date: 21/01/2010

Development of cellulosic biofuels from non-food crops is currently an area of intense research interest. Tailoring depolymerizing enzymes to particular feedstocks and pretreatment conditions is one promising avenue of research in this area. Here we added a green-waste compost inoculum to switchgrass (Panicum virgatum) and simulated thermophilic composting in a bioreactor to select for a switchgrass-adapted community and to facilitate targeted discovery of glycoside hydrolases. Small-subunit (SSU) rRNA-based community profiles revealed that the microbial community changed dramatically between the initial and switchgrass-adapted compost (SAC), with some bacterial populations being enriched over 20-fold. We obtained 225 Mbp of 454-titanium pyrosequence data from the SAC community and conservatively identified 800 genes encoding glycoside hydrolase domains that were biased toward depolymerizing grass cell wall components. Of these, ∼10% were putative cellulases, mostly belonging to families GH5 and GH9. We synthesized two SAC GH9 genes with codon optimization for heterologous expression in Escherichia coli and observed activity for one on carboxymethyl cellulose. The active GH9 enzyme has a temperature optimum of 50°C and a pH range of 5.5 to 8, consistent with the composting conditions applied. We demonstrate that microbial communities adapt to switchgrass decomposition under simulated composting conditions and that full-length genes can be identified from complex metagenomic sequence data, synthesized, and expressed, resulting in active enzyme.

Public Library of Science (PLOS)

University of Queensland eSpace

UNT Digital Library

RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

Author: A. A. Mironov
A. E. Kazakov
A. P. Arkin
Alkema
Baumbach
D'haeseleer
D. A. Rodionov
E. D. Stavrovskaya
E. S. Novichkova
Fredrickson
Gelfand
I. Dubchak
M. S. Gelfand
Manson McGuire
McCue
Overbeek
P. S. Novichkov
Price
Rodionov
Tan
Publication venue: Oxford University Press
Publication date: 26/05/2010

The RegPredict web server provides comparative genomics tools for the reconstruction and analysis of microbial regulons. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows are currently implemented in RegPredict: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. The interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov
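As a rough illustration of how a positional weight matrix can be used to search for candidate sites of a known motif, the following minimal Python sketch builds a log-odds matrix from per-position base counts and scans a sequence for the best-scoring window. It is a generic textbook-style sketch, not RegPredict's actual scoring scheme: the function names `pwm_from_counts` and `best_site`, the pseudocount of 1, the uniform background of 0.25, and the toy motif are all assumptions made for the example.

```python
from math import log2

BASES = "ACGT"


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log-odds weight matrix.

    counts: list of dicts, one per motif position, mapping base -> count.
    pseudocount and background are illustrative assumptions.
    """
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def best_site(sequence, pwm):
    """Return (start, score) of the highest-scoring window, or None.

    Assumes the sequence contains only A, C, G and T.
    """
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[k][base] for k, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Hypothetical three-base motif "TGA" observed in eight aligned sites.
toy_pwm = pwm_from_counts([{"T": 8}, {"G": 8}, {"A": 8}])
print(best_site("CCTGACC", toy_pwm))
```

In practice a score threshold would decide which windows count as candidate sites; the sketch reports only the single best window.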

UNT Digital Library

Reverse engineering gene regulatory network from microarray data using linear time-variant model

Author: B Perrin
D Cho
D Marbach
D Tominaga
D Weaver
F Streichert
Hitoshi Iba
J Brest
J Kim
J Proakis
L Shapiro
M Bansal
M Savageau
M Zou
Mitra Kabir
N Friedman
N Noman
Nasimul Noman
P D'haeseleer
R Cho
R Storn
S Kauffman
S Kikuchi
S Kim
S Kimura
S Liang
T Akutsu
T Gardner
T Rogalsky
Publication venue: BioMed Central
Publication date: 01/01/2010

Proteomic Profiling of Burkholderia thailandensis During Host Infection Using Bio-Orthogonal Noncanonical Amino Acid Tagging (BONCAT)

Author: Brent W. Segelke
Magdalena Franco
Megan J. Liou
Patrik M. D'haeseleer
Sahar H. El-Etr
Steven S. Branda
Yasmeen Haider
Publication venue: Frontiers Media SA
Publication date: 01/01/2018

Burkholderia pseudomallei and B. mallei are the causative agents of melioidosis and glanders, respectively, and are often fatal to humans and animals. Owing to the high fatality rate, potential for spread by aerosolization, and the lack of efficacious therapeutics, B. pseudomallei and B. mallei are considered biothreat agents of concern. In this study, we investigate the proteome of Burkholderia thailandensis, a closely related surrogate for the two more virulent Burkholderia species, during infection of host cells, and compare it to that of B. thailandensis in culture. Studying the proteome of Burkholderia spp. during infection is expected to reveal molecular mechanisms of intracellular survival and host immune evasion, but proteomic profiling of Burkholderia during host infection is challenging. Proteomic analyses of host-associated bacteria are typically hindered by the overwhelming host protein content recovered from infected cultures. To address this problem, we have applied bio-orthogonal noncanonical amino acid tagging (BONCAT) to B. thailandensis, enabling the enrichment of newly expressed bacterial proteins from virtually any growth condition, including host cell infection. In this study, we show that B. thailandensis proteins were selectively labeled and efficiently enriched from infected host cells using BONCAT. We also demonstrate that this method can be used to label bacteria in situ by fluorescent tagging. Finally, we present a global proteomic profile of B. thailandensis as it infects host cells and a list of proteins that are differentially regulated in infection conditions as compared to bacterial monoculture. Among the identified proteins are products of quorum-sensing-regulated genes as well as homologs of previously identified virulence factors. This method provides a powerful tool to study the molecular processes during Burkholderia infection, a much-needed addition to the Burkholderia molecular toolbox.

Frontiers - Publisher Connector

Speeding up the Consensus Clustering methodology for microarray data analysis

Author: A Ben-Hur
A Bertoni
A Borodin
A Jain
AK Jain
B Everitt
B Mirkin
E Levine
Filippo Utro
G Frahling
G Milligan
J Handl
J Kraus
JA Hartigan
JA Rice
JP Brunet
K Devarajan
K Yeung
L Kaufman
P Bertrand
P D'haeseleer
P Hansen
R Giancarlo
R Shamir
R Tibshirani
Raffaele Giancarlo
S Dudoit
S Klie
S Monti
S Salvador
S Seal
T Hastie
TP Speed
V Di Gesú
V Roth
W Krzanowski
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The inference of the number of clusters in a dataset, a fundamental problem in Statistics, Data Analysis and Classification, is usually addressed via internal validation measures. The stated problem is quite difficult, in particular for microarrays, since the inferred prediction must be sensible enough to capture the inherent biological structure in a dataset, e.g., functionally related genes. Despite the rich literature present in that area, the identification of an internal validation measure that is both fast and precise has proved to be elusive. In order to partially fill this gap, we propose a speed-up of Consensus (Consensus Clustering), a methodology whose purpose is the provision of a prediction of the number of clusters in a dataset, together with a dissimilarity matrix (the consensus matrix) that can be used by clustering algorithms. As detailed in the remainder of the paper, Consensus is a natural candidate for a speed-up.
Results: Since the time-precision performance of Consensus depends on two parameters, our first task is to show that a simple adjustment of the parameters is not enough to obtain a good precision-time trade-off. Our second task is to provide a fast approximation algorithm for Consensus: the closely related algorithm FC (Fast Consensus), which would have the same precision as Consensus with substantially better time performance. The performance of FC has been assessed via extensive experiments on twelve benchmark datasets that summarize key features of microarray applications, such as cancer studies, gene expression with up and down patterns, and a full spectrum of dimensionality up to over a thousand. Based on their outcome, compared with previous benchmarking results available in the literature, FC turns out to be among the fastest internal validation methods, while retaining the same outstanding precision of Consensus. Moreover, it also provides a consensus matrix that can be used as a dissimilarity matrix, guaranteeing the same performance as the corresponding matrix produced by Consensus. We have also experimented with the use of Consensus and FC in conjunction with NMF (Nonnegative Matrix Factorization), in order to identify the correct number of clusters in a dataset. Although NMF is an increasingly popular technique for biological data mining, our results are somewhat disappointing and complement quite well the state of the art about NMF, shedding further light on its merits and limitations.
Conclusions: In summary, FC with a parameter setting that makes it robust with respect to small and medium-sized datasets, i.e., with the number of items to cluster in the hundreds and the number of conditions up to a thousand, seems to be the internal validation measure of choice. Moreover, the technique we have developed here can be used in other contexts, in particular for the speed-up of stability-based validation measures.
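To make the notion of a consensus matrix concrete, here is a minimal Python sketch of how such a matrix is commonly assembled from repeated clusterings of resampled data: entry (i, j) is the fraction of runs in which items i and j fell in the same cluster, among the runs in which both were sampled. This follows the general consensus clustering idea rather than the specific Consensus or FC algorithms of the paper, which are not detailed in the abstract. The function name `consensus_matrix`, the input layout, and the toy runs are assumptions for illustration, and the clustering step itself is taken as given.

```python
from itertools import combinations


def consensus_matrix(n_items, runs):
    """Build a consensus matrix from several clustering runs.

    runs: list of dicts, one per resampling run, mapping each sampled
    item index (0 .. n_items-1) to the cluster label it received.
    Pairs that were never sampled together are left at 0.0.
    """
    together = [[0] * n_items for _ in range(n_items)]
    sampled = [[0] * n_items for _ in range(n_items)]
    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1

    matrix = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        matrix[i][i] = 1.0  # an item always co-clusters with itself
        for j in range(n_items):
            if i != j and sampled[i][j]:
                matrix[i][j] = together[i][j] / sampled[i][j]
    return matrix


# Three hypothetical runs over four items; each run samples three items.
toy_runs = [
    {0: "a", 1: "a", 2: "b"},
    {0: "x", 1: "x", 3: "y"},
    {1: 0, 2: 0, 3: 1},
]
for row in consensus_matrix(4, toy_runs):
    print(row)
```

One common way to obtain a dissimilarity matrix from this is to take 1 minus each entry, so that items that always co-cluster have dissimilarity 0.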

Archivio istituzionale della ricerca - Università di Palermo

Unraveling gene regulatory networks from time-resolved gene expression data -- a measures comparison study

Peer reviewed. Publisher PD

Aberdeen University Research

Repositorium für Naturwissenschaften und Technik

MPG.PuRe

Equilibrium reconstruction for Single Helical Axis reversed field pinch plasmas

Author: A Alfier
A Canton
A Fassina
B Momo
Bodin H A B
Cappello S
Paccagnella R
Sindoni E
Carraro L
D Terranova
D'Haeseleer W D
E Martines
F Bonomo
Fitzpatrick R
Franz P
Ji H
Marrelli L
Martini S
Menmuir S
Ortolani S
P Franz
P Innocente
P Zanca
Pereverzev G
Yushmanov P N
Piovesan P
Puiatti M E
Pustovitov V D
R Lorenzini
Valisa M
Zanca P
Publication venue: IOP Publishing
Publication date: 11/01/2011

Single Helical Axis (SHAx) configurations are emerging as the natural state for high current reversed field pinch (RFP) plasmas. These states feature the presence of transport barriers in the core plasma. Here we present a method for computing the equilibrium magnetic surfaces for these states in the force-free approximation, which has been implemented in the SHEq code. The method is based on the superposition of a zeroth order axisymmetric equilibrium and of a first order helical perturbation computed according to Newcomb's equation supplemented with edge magnetic field measurements. The mapping of the measured electron temperature profiles, soft X-ray emission and interferometric density measurements on the computed magnetic surfaces demonstrates the quality of the equilibrium reconstruction. The procedure for computing flux surface averages is illustrated, and applied to the evaluation of the thermal conductivity profile. The consistency of the evaluated equilibria with Ohm's law is also discussed.
Comment: Submitted to Plasma Physics and Controlled Fusion

arXiv.org e-Print Archive

Bayesian approaches to reverse engineer cellular systems: a simulation study on nonlinear Gaussian networks

Author: A Bernard
AA Margolin
AJ Hartemink
BE Perrin
D Husmeier
DE Zak
Fulvia Ferrazzi
GF Cooper
H de Jong
IM Ong
J Yu
K Murphy
KC Chen
M Zou
Marco F Ramoni
N Dojer
N Friedman
N Nariai
P D'haeseleer
P Le Phillip
P Sebastiani
Paola Sebastiani
Riccardo Bellazzi
S Imoto
S Kim
VA Smith
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND. Reverse engineering cellular networks is currently one of the most challenging problems in systems biology. Dynamic Bayesian networks (DBNs) seem to be particularly suitable for inferring relationships between cellular variables from the analysis of time series measurements of mRNA or protein concentrations. As evaluating inference results on a real dataset is controversial, the use of simulated data has been proposed. However, DBN approaches that use continuous variables, thus avoiding the information loss associated with discretization, have not yet been extensively assessed, and most of the proposed approaches have dealt with linear Gaussian models. RESULTS. We propose a generalization of dynamic Gaussian networks to accommodate nonlinear dependencies between variables. As a benchmark dataset to test the new approach, we used data from a mathematical model of cell cycle control in budding yeast that realistically reproduces the complexity of a cellular system. We evaluated the ability of the networks to describe the dynamics of cellular systems and their precision in reconstructing the true underlying causal relationships between variables. We also tested the robustness of the results by analyzing the effect of noise on the data, and the impact of a different sampling time. CONCLUSION. The results confirmed that DBNs with Gaussian models can be effectively exploited for a first level analysis of data from complex cellular systems. The inferred models are parsimonious and have a satisfying goodness of fit. Furthermore, the networks not only offer a phenomenological description of the dynamics of cellular systems, but are also able to suggest hypotheses concerning the causal interactions between variables. 
The proposed nonlinear generalization of Gaussian models yielded models characterized by a slightly lower goodness of fit than the linear model, but a better ability to recover the true underlying connections between variables.
Italian Ministry of University and Scientific Research; National Institutes of Health & National Human Genome Research Institute (HG003354-01A2); Collegio Ghislieri, Pavia, Italy fellowship

CiteSeerX

Boston University Institutional Repository (OpenBU)

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

Mapping transcription mechanisms from multimodal genomic data

Author: AC Nica
AJ Butte
AM Wentzell
CG Mullighan
CT Fong
DJC MacKay
G Alterovitz
G Zaza
Gil Alterovitz
H-H Chang
HH Chang
Hsun-Hsien Chang
I Joliffe
J Rissanen
J Roman-Gomez
J Xiao
JT Forton
K Yamakawa
L Vitale
M Morley
Marco F Ramoni
Michael McGeachie
P D'Haeseleer
RA Roela
RS Huang
S Bungaro
S Suthram
TA Drake
TFC Mackay
VG Cheung
Publication venue: Springer Science and Business Media LLC
Publication date: 01/10/2010

Background: Identification of expression quantitative trait loci (eQTLs) is an emerging area in genomic study. The task requires an integrated analysis of genome-wide single nucleotide polymorphism (SNP) data and gene expression data, raising a new computational challenge due to the tremendous size of the data.
Results: We develop a method to identify eQTLs. The method represents eQTLs as information flux between genetic variants and transcripts. We use information theory to simultaneously interrogate SNP and gene expression data, resulting in a Transcriptional Information Map (TIM) which captures the network of transcriptional information that links genetic variations, gene expression and regulatory mechanisms. These maps are able to identify both cis- and trans-regulating eQTLs. Application to a dataset of leukemia patients identifies eQTLs in the regions of the GART, PCP4, DSCAM, and RIPK4 genes that regulate ADAMTS1, a known leukemia correlate.
Conclusions: The information theory approach presented in this paper is able to infer the dependence networks between SNPs and transcripts, which in turn can identify cis- and trans-eQTLs. The application of our method to the leukemia study explains how genetic variants and gene expression are linked to leukemia.
National Human Genome Research Institute (U.S.) (R01HG003354); National Institute of Allergy and Infectious Diseases (U.S.) (U19 AI067854-05); National Heart, Lung, and Blood Institute (grant T32 HL007427-28); National Institutes of Health (U.S.) (grant K99 LM009826)
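One simple information-theoretic quantity for relating a SNP to a transcript is the mutual information between genotype and a discretized expression level. The Python sketch below estimates it from paired samples. The abstract does not say which measure the Transcriptional Information Map uses, so mutual information is an assumption here, as are the function name `mutual_information`, the plug-in frequency estimator, and the toy genotype and expression labels.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equally long sequences of discrete observations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical data: genotype coded as minor-allele count per patient,
# expression of one transcript binned into three levels.
genotype = [0, 0, 1, 1, 2, 2]
expression = ["low", "low", "mid", "mid", "high", "high"]
print(mutual_information(genotype, expression))
```

A value of 0 means the observed genotype and expression labels are statistically independent in the sample; larger values indicate stronger dependence. With few samples the plug-in estimate is biased upward, so in practice some significance assessment would be needed before calling an eQTL.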

DSpace@MIT