111 research outputs found

Constrained mixture estimation for analysis and robust classification of clinical time series

Author: Alexander Schliep
Alexander Schönhuth
Archelos
Bar-Joseph
Baranzini
Basu
Bilmes
Borgwardt
Brunet
Castelli
Chapelle
Christoph Hafemeister
Costa
Eisen
Ernst
Fraley
Hastie
Irizarry
Ivan G. Costa
Kaminski
Lange
Lin
Lottaz
Lu
MacLachlan
Monti
Nelms
Nigam
Reimand
Ro
Satoh
Schliep
Schönhuth
Spang
van Baarsen
van't Veer
Yang
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Personalized medicine based on molecular aspects of diseases, such as gene expression profiling, has become increasingly popular. However, one faces multiple challenges when analyzing clinical gene expression data; most of the well-known theoretical issues, such as the high dimension of feature spaces versus few examples, noise and missing data, apply. Special care is needed when designing classification procedures that support personalized diagnosis and choice of treatment. Here, we particularly focus on classification of interferon-β (IFNβ) treatment response in Multiple Sclerosis (MS) patients, which has attracted substantial attention in the recent past. Half of the patients remain unaffected by IFNβ treatment, which is still the standard. For them, the treatment should be ceased promptly to mitigate the side effects

CiteSeerX

Crossref

PubMed Central

A temporal switch model for estimating transcriptional activity in gene expression

Author: Finkenstädt Bärbel
Jenkins Dafyd J.
Rand D. A. (David A.)
Publication venue: Oxford University Press
Publication date: 11/03/2013

Motivation: The analysis and mechanistic modelling of time series gene expression data provided by techniques such as microarrays, NanoString, reverse transcription–polymerase chain reaction and advanced sequencing are invaluable for developing an understanding of the variation in key biological processes. We address this by proposing the estimation of a flexible dynamic model, which decouples temporal synthesis and degradation of mRNA and, hence, allows for transcriptional activity to switch between different states. Results: The model is flexible enough to capture a variety of observed transcriptional dynamics, including oscillatory behaviour, in a way that is compatible with the demands imposed by the quality, time-resolution and quantity of the data. We show that the timing and number of switch events in transcriptional activity can be estimated alongside individual gene mRNA stability with the help of a Bayesian reversible jump Markov chain Monte Carlo algorithm. To demonstrate the methodology, we focus on modelling the wild-type behaviour of a selection of 200 circadian genes of the model plant Arabidopsis thaliana. The results support the idea that using a mechanistic model to identify transcriptional switch points is likely to strongly contribute to efforts in elucidating and understanding key biological processes, such as transcription and degradation

Warwick Research Archives Portal Repository

A statistical analysis of memory CD8 T cell differentiation: An application of a hierarchical state space model to a short time course microarray experiment

Author: Halloran M. Elizabeth
Kaech Susan M.
Wu Haiyan
Yuan Ming
Publication venue: Institute of Mathematical Statistics
Publication date: 07/12/2007

CD8 T cells are specialized immune cells that play an important role in the regulation of antiviral immune response and the generation of protective immunity. In this paper we investigate the differentiation of memory CD8 T cells in the immune response using a short time course microarray experiment. Structurally, this experiment is similar to many in that it involves measurements taken on independent samples, in one biological group, at a small number of irregularly spaced time points, and exhibiting patterns of temporal nonstationarity. To analyze this CD8 T-cell experiment, we develop a hierarchical state space model so that we can: (1) detect temporally differentially expressed genes, (2) identify the direction of successive changes over time, and (3) assess the magnitude of successive changes over time. We incorporate hidden Markov models into our model to utilize the information embedded in the time series and set up the proposed hierarchical state space model in an empirical Bayes framework to utilize the population information from the large-scale data. Analysis of the CD8 T-cell experiment using the proposed model results in biologically meaningful findings. Temporal patterns involved in the differentiation of memory CD8 T cells are summarized separately and performance of the proposed model is illustrated in a simulation study. Comment: Published at http://dx.doi.org/10.1214/07-AOAS118 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Semi-supervised learning for the identification of syn-expressed genes from fused microarray and in situ image data

Author: A Schliep
Alexander Schliep
B Edgar
C Niehrs
CLL Hendriks
D Tautz
EP Xing
G McLachlan
GJ McLachlan
H Ge
H Peng
I Costa
I Lee
Ivan G Costa
J Bilmes
J Ernst
JY Pan
KY Yeung
L Opitz
Lennart Opitz
M Ashburner
M Leptin
M Medvedovic
MB Eisen
MN Arbeitman
P Tomancak
R Gonzalez
R Sokal
Roland Krause
SD Hooper
SK Ng
SVE Keränen
T Beissbarth
T Lange
V Stolc
W Pan
Y Luan
Z Bar-Joseph
Z Lu
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Gene expression measurements during the development of the fly Drosophila melanogaster are routinely used to find functional modules of temporally co-expressed genes. Complementary large data sets of in situ RNA hybridization images for different stages of the fly embryo elucidate the spatial expression patterns. Results: Using a semi-supervised approach, constrained clustering with mixture models, we can find clusters of genes exhibiting spatio-temporal similarities in expression, or syn-expression. The temporal gene expression measurements are taken as primary data for which pairwise constraints are computed in an automated fashion from raw in situ images without the need for manual annotation. We investigate the influence of these pairwise constraints in the clustering and discuss the biological relevance of our results. Conclusion: Spatial information contributes to a detailed, biologically meaningful analysis of temporal gene expression data. Semi-supervised learning provides a flexible, robust and efficient framework for integrating data sources of differing quality and abundance

Crossref

Springer - Publisher Connector

PubMed Central

MPG.PuRe

Revealing cell cycle control by combining model-based detection of periodic expression with novel cis-regulatory descriptors

Author: Andersson Claes R
Gustafsson Mats G
Hvidsten Torgeir R
Isaksson Anders
Komorowski Jan
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: We address the issue of explaining the presence or absence of phase-specific transcription in budding yeast cultures under different conditions. To this end we use a model-based detector of gene expression periodicity to divide genes into classes depending on their behavior in experiments using different synchronization methods. While computational inference of gene regulatory circuits typically relies on expression similarity (clustering) in order to find classes of potentially co-regulated genes, this method instead takes advantage of known time profile signatures related to the studied process. Results: We explain the regulatory mechanisms of the inferred periodic classes with cis-regulatory descriptors that combine upstream sequence motifs with experimentally determined binding of transcription factors. By systematic statistical analysis we show that periodic classes are best explained by combinations of descriptors rather than single descriptors, and that different combinations correspond to periodic expression in different classes. We also find evidence for additive regulation in that the combinations of cis-regulatory descriptors associated with genes periodically expressed in fewer conditions are frequently subsets of combinations associated with genes periodically expressed in more conditions. Finally, we demonstrate that our approach retrieves combinations that are more specific towards known cell-cycle related regulators than the frequently used clustering approach. Conclusion: The results illustrate how a model-based approach to expression analysis may be particularly well suited to detect biologically relevant mechanisms. Our new approach makes it possible to provide more refined hypotheses about regulatory mechanisms of the cell cycle and it can easily be adjusted to reveal regulation of other, non-periodic, cellular processes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Analysis of Clickstream Data

Author: Jamalzadeh Mohammadamin
Publication venue
Publication date: 01/01/2011

This thesis is concerned with providing further statistical development in the area of web usage analysis to explore web browsing behaviour patterns. We received two data sources: web log files and operational data files for the websites, which contained information on online purchases. There are many research questions regarding web browsing behaviour. Specifically, we focused on the depth-of-visit metric and implemented an exploratory analysis of this feature using clickstream data. Due to the large volume of data available in this context, we chose to present effect size measures along with all statistical analysis of data. We introduced two new robust measures of effect size for two-sample comparison studies for non-normal situations, specifically where the difference between two populations is due to the shape parameter. The proposed effect sizes perform adequately for non-normal data, as well as when two distributions differ in their shape parameters. We focus on conversion analysis to investigate the causal relationship between the general clickstream information and online purchasing using a logistic regression approach. The aim is to find a classifier that assigns a probability to the event of an online purchase on an e-commerce website. We also develop the application of a mixture of hidden Markov models (MixHMM) to model web browsing behaviour using sequences of web pages viewed by users of an e-commerce website. The mixture of hidden Markov models is fitted in a Bayesian context using Gibbs sampling. We address the slow mixing problem of using Gibbs sampling in high dimensional models, and use over-relaxed Gibbs sampling, as well as the forward-backward EM algorithm, to obtain an adequate sample of the posterior distributions of the parameters. The MixHMM provides the advantage of clustering users based on their browsing behaviour, and also gives an automatic classification of web pages based on the probability of a web page being observed by visitors to the website

Durham e-Theses

OpenGrey Repository

Proceedings of the 35th International Workshop on Statistical Modelling: July 20-24, 2020, Bilbao, Basque Country, Spain

Author: International Workshop on Statistical Modelling (35º. 2020. Bilbao)
Irigoyen Garbizu Itziar
Lee Dae-Ji
Martínez-Minaya Joaquín
Rodríguez-Álvarez María Xosé
Publication venue: Servicio Editorial de la Universidad del País Vasco/Euskal Herriko Unibertsitatearen Argitalpen Zerbitzua
Publication date: 01/01/2020

466 p. The International Workshop on Statistical Modelling (IWSM) is a reference workshop in promoting statistical modelling and applications of Statistics for researchers, academics and industrialists in a broad sense. Unfortunately, the global COVID-19 pandemic has not allowed holding the 35th edition of the IWSM in Bilbao in July 2020. Despite the situation and following the spirit of the Workshop and the Statistical Modelling Society, we are delighted to bring you the proceedings book of extended abstracts

Archivo Digital para la Docencia y la Investigación

Markov models of biomolecular systems

Author: Arbon Robert E
Publication venue
Publication date: 28/09/2021

Explore Bristol Research

Constructing Temporal Transcriptional Regulatory Cascades in the Context of Development and Cell Differentiation

Author: Daou Rayan
Publication venue
Publication date: 08/05/2020

Georg-August-University Göttingen

Learning Structure in Time Series for Neuroscience and Beyond

Author: Pfau David Benjamin
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2015

Advances in neuroscience are producing data at an astounding rate - data which are fiendishly complex both to process and to interpret. Biological neural networks are high-dimensional, nonlinear, noisy, heterogeneous, and in nearly every way defy the simplifying assumptions of standard statistical methods. In this dissertation we address a number of issues with understanding the structure of neural populations, from the abstract level of how to uncover structure in generic time series, to the practical matter of finding relevant biological structure in state-of-the-art experimental techniques. To learn the structure of generic time series, we develop a new statistical model, which we dub the probabilistic deterministic infinite automata (PDIA), which uses tools from nonparametric Bayesian inference to learn a very general class of sequence models. We show that the models learned by the PDIA often offer better predictive performance and faster inference than Hidden Markov Models, while being significantly more compact than models that simply memorize contexts. For large populations of neurons, models like the PDIA become unwieldy, and we instead investigate ways to robustly reduce the dimensionality of the data. In particular, we adapt the generalized linear model (GLM) framework for regression to the case of matrix completion, which we call the low-dimensional GLM. We show that subspaces and dynamics of neural activity can be accurately recovered from model data, and that only minimal assumptions about the structure of the dynamics can still lead to good predictive performance on real data. Finally, to bridge the gap between recording technology and analysis, particularly as recordings from ever-larger populations of neurons become the norm, automated methods for extracting activity from raw recordings become a necessity. We present a number of methods for automatically segmenting biological units from optical imaging data, with applications to light sheet recording of genetically encoded calcium indicator fluorescence in the larval zebrafish, and optical electrophysiology using genetically encoded voltage indicators in culture. Together, these methods are a powerful set of tools for addressing the diverse challenges of modern neuroscience

Columbia University Academic Commons