Constrained Co-clustering of Gene Expression Data

Author: Boulicaut J. F.
Pensa Ruggero Gaetano
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 01/01/2008

Institutional Research Information System University of Turin

Semi-supervised learning for the identification of syn-expressed genes from fused microarray and in situ image data

Author: A Schliep
Alexander Schliep
B Edgar
C Niehrs
CLL Hendriks
D Tautz
EP Xing
G McLachlan
GJ McLachlan
H Ge
H Peng
I Costa
I Lee
Ivan G Costa
J Bilmes
J Ernst
JY Pan
KY Yeung
L Opitz
Lennart Opitz
M Ashburner
M Leptin
M Medvedovic
MB Eisen
MN Arbeitman
P Tomancak
R Gonzalez
R Sokal
Roland Krause
SD Hooper
SK Ng
SVE Keränen
T Beissbarth
T Lange
V Stolc
W Pan
Y Luan
Z Bar-Joseph
Z Lu
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Gene expression measurements during the development of the fly Drosophila melanogaster are routinely used to find functional modules of temporally co-expressed genes. Complementary large data sets of in situ RNA hybridization images for different stages of the fly embryo elucidate the spatial expression patterns. Results: Using a semi-supervised approach, constrained clustering with mixture models, we can find clusters of genes exhibiting spatio-temporal similarities in expression, or syn-expression. The temporal gene expression measurements are taken as primary data for which pairwise constraints are computed in an automated fashion from raw in situ images without the need for manual annotation. We investigate the influence of these pairwise constraints in the clustering and discuss the biological relevance of our results. Conclusion: Spatial information contributes to a detailed, biologically meaningful analysis of temporal gene expression data. Semi-supervised learning provides a flexible, robust and efficient framework for integrating data sources of differing quality and abundance.

Crossref

Springer - Publisher Connector

PubMed Central

MPG.PuRe

Integrated biclustering of heterogeneous genome-wide datasets for the inference of global regulatory networks

Author: Baliga Nitin S
Bonneau Richard
Reiss David J
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The learning of global genetic regulatory networks from expression data is a severely under-constrained problem that is aided by reducing the dimensionality of the search space by means of clustering genes into putatively co-regulated groups, as opposed to those that are simply co-expressed. Because genes may be co-regulated only across a subset of all observed experimental conditions, biclustering (clustering of genes and conditions) is more appropriate than standard clustering. Co-regulated genes are also often functionally (physically, spatially, genetically, and/or evolutionarily) associated, and such a priori known or pre-computed associations can provide support for appropriately grouping genes. One important association is the presence of one or more common cis-regulatory motifs. In organisms where these motifs are not known, their de novo detection, integrated into the clustering algorithm, can help to guide the process towards more biologically parsimonious solutions. RESULTS: We have developed an algorithm, cMonkey, that detects putative co-regulated gene groupings by integrating the biclustering of gene expression data and various functional associations with the de novo detection of sequence motifs. CONCLUSION: We have applied this procedure to the archaeon Halobacterium NRC-1, as part of our efforts to decipher its regulatory network. In addition, we used cMonkey on public data for three organisms in the other two domains of life: Helicobacter pylori, Saccharomyces cerevisiae, and Escherichia coli. The biclusters detected by cMonkey both recapitulated known biology and enabled novel predictions (some for Halobacterium were subsequently confirmed in the laboratory). For example, it identified the bacteriorhodopsin regulon, assigned additional genes to this regulon with apparently unrelated function, and detected its known promoter motif.
We have performed a thorough comparison of cMonkey results against other clustering methods, and find that cMonkey biclusters are more parsimonious with all available evidence for co-regulation.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Robust filtering for gene expression time series data with variance constraints

Author: Fraser K
Liu X
Shu H
Wang Z
Wei G
Publication venue: 'Informa UK Limited'
Publication date: 01/05/2007

This is the post print version of the article. The official published version can be obtained from the link below - Copyright 2007 Taylor & Francis Ltd. In this paper, an uncertain discrete-time stochastic system is employed to represent a model for gene regulatory networks from time series data. A robust variance-constrained filtering problem is investigated for a gene expression model with stochastic disturbances and norm-bounded parameter uncertainties, where the stochastic perturbation is in the form of a scalar Gaussian white noise with constant variance and the parameter uncertainties enter both the system matrix and the output matrix. The purpose of the addressed robust filtering problem is to design a linear filter such that, for the admissible bounded uncertainties, the filtering error system is Schur stable and the individual error variance is less than a prespecified upper bound. By using the linear matrix inequality (LMI) technique, sufficient conditions are first derived for ensuring the desired filtering performance for the gene expression model. Then the filter gain is characterized in terms of the solution to a set of LMIs, which can easily be solved by using available software packages. A simulation example is exploited for a gene expression model in order to demonstrate the effectiveness of the proposed design procedures. This work was supported in part by the Engineering and Physical Sciences Research Council (EPSRC) of the UK under Grants GR/S27658/01 and EP/C524586/1, the Biotechnology and Biological Sciences Research Council (BBSRC) of the UK under Grants BB/C506264/1 and 100/EGM17735, the Nuffield Foundation of the UK under Grant NAL/00630/G, and the Alexander von Humboldt Foundation of Germany.
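
The abstract above requires the filtering error system to be Schur stable, meaning every eigenvalue of the discrete-time system matrix lies strictly inside the unit circle. As a minimal sketch of that condition only (not the paper's LMI-based filter design), the check below handles the 2x2 case with the second-order Jury conditions. The function name and the example matrices are hypothetical and chosen purely for illustration.

```python
def is_schur_stable_2x2(a11, a12, a21, a22):
    """Return True if the 2x2 matrix [[a11, a12], [a21, a22]] is Schur stable.

    The characteristic polynomial is z^2 - trace*z + det. For a second-order
    polynomial, both roots lie strictly inside the unit circle if and only if
    |det| < 1 and |trace| < 1 + det (Jury conditions).
    """
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    return abs(det) < 1 and abs(trace) < 1 + det


if __name__ == "__main__":
    # Hypothetical system matrices, not taken from the paper.
    print(is_schur_stable_2x2(0.5, 0.1, 0.0, 0.3))  # eigenvalues 0.5 and 0.3
    print(is_schur_stable_2x2(1.2, 0.0, 0.0, 0.5))  # eigenvalue 1.2 is outside the unit circle
```

For larger systems one would normally compute the spectral radius numerically instead of using closed-form conditions.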

Brunel University Research Archive

Statistical modelling of transcript profiles of differentially regulated genes

Author: Burton Kerry S.
Eastwood Daniel C.
Mead A. (Andrew)
Sergeant Martin J.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2008

Background: The vast quantities of gene expression profiling data produced in microarray studies, and the more precise quantitative PCR, are often not statistically analysed to their full potential. Previous studies have summarised gene expression profiles using simple descriptive statistics, basic analysis of variance (ANOVA) and the clustering of genes based on simple models fitted to their expression profiles over time. We report the novel application of statistical non-linear regression modelling techniques to describe the shapes of expression profiles for the fungus Agaricus bisporus, quantified by PCR, and for E. coli and Rattus norvegicus, using microarray technology. The use of parametric non-linear regression models provides a more precise description of expression profiles, reducing the "noise" of the raw data to produce a clear "signal" given by the fitted curve, and describing each profile with a small number of biologically interpretable parameters. This approach then allows the direct comparison and clustering of the shapes of response patterns between genes and potentially enables a greater exploration and interpretation of the biological processes driving gene expression. Results: Quantitative reverse transcriptase PCR-derived time-course data of genes were modelled. "Split-line" or "broken-stick" regression identified the initial time of gene up-regulation, enabling the classification of genes into those with primary and secondary responses. Five-day profiles were modelled using the biologically-oriented, critical exponential curve, y(t) = A + (B + Ct)R^t + ε. This non-linear regression approach allowed the expression patterns for different genes to be compared in terms of curve shape, time of maximal transcript level and the decline and asymptotic response levels. Three distinct regulatory patterns were identified for the five genes studied.
Applying the regression modelling approach to microarray-derived time course data allowed 11% of the Escherichia coli features to be fitted by an exponential function, and 25% of the Rattus norvegicus features could be described by the critical exponential model, all with statistical significance of p < 0.05. Conclusion: The statistical non-linear regression approaches presented in this study provide detailed biologically oriented descriptions of individual gene expression profiles, using biologically variable data to generate a set of defining parameters. These approaches have application to the modelling and greater interpretation of profiles obtained across a wide range of platforms, such as microarrays. Through careful choice of appropriate model forms, such statistical regression approaches allow an improved comparison of gene expression profiles, and may provide an approach for the greater understanding of common regulatory mechanisms between genes.
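
The critical exponential curve in the abstract above can be illustrated with a short sketch. It reads the formula as y(t) = A + (B + Ct)R^t, omits the noise term ε, and derives the time of maximal transcript level by setting the derivative to zero. The function names and all parameter values are hypothetical, not taken from the paper, and the closed-form peak assumes 0 < R < 1 and C > 0.

```python
import math


def critical_exponential(t, A, B, C, R):
    """Mean of the critical exponential curve: A + (B + C*t) * R**t."""
    return A + (B + C * t) * R ** t


def time_of_maximum(B, C, R):
    """Time at which the curve peaks, assuming 0 < R < 1 and C > 0.

    dy/dt = R**t * (C + (B + C*t) * ln R) = 0  gives  t = -1/ln(R) - B/C.
    """
    return -1.0 / math.log(R) - B / C


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the curve shape.
    A, B, C, R = 1.0, 0.0, 2.0, 0.5
    t_peak = time_of_maximum(B, C, R)
    print("peak time:", t_peak)
    print("peak level:", critical_exponential(t_peak, A, B, C, R))
    # For 0 < R < 1 the curve decays towards the asymptote A as t grows.
    print("level at t=100:", critical_exponential(100.0, A, B, C, R))
```

Under these assumptions A is the asymptotic response level and A + B is the level at t = 0, which is what makes the parameters directly comparable between genes.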

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

Cronfa at Swansea University

Inferring gene regulatory networks using ensembles of feature selection techniques

Author: Demeester Piet
Dhaene Tom
Geurts Pierre
Huynh-thu Vân anh
Ruyssinck Joeri
Saeys Yvan
Publication date: 01/01/2012

Ghent University Academic Bibliography