Joint Structure Learning of Multiple Non-Exchangeable Networks
Several methods have recently been developed for joint structure learning of
multiple (related) graphical models or networks. These methods treat individual
networks as exchangeable, such that each pair of networks is equally
encouraged to have similar structures. However, in many practical applications,
exchangeability in this sense may not hold, as some pairs of networks may be
more closely related than others, for example due to group and sub-group
structure in the data. Here we present a novel Bayesian formulation that
generalises joint structure learning beyond the exchangeable case. In addition
to a general framework for joint learning, we (i) provide a novel default prior
over the joint structure space that requires no user input; (ii) allow for
latent networks; (iii) give an efficient, exact algorithm for the case of time
series data and dynamic Bayesian networks. We present empirical results on
non-exchangeable populations, including a real data example from biology, where
cell-line-specific networks are related according to genomic features.
Comment: To appear in Proceedings of the Seventeenth International Conference
on Artificial Intelligence and Statistics (AISTATS)
Uncertainty Quantification
Uncertainty quantification (UQ) is concerned with including and characterising uncertainties in mathematical models.
Major steps comprise proper description of system uncertainties, analysis and efficient quantification of uncertainties in predictions and design problems, and statistical inference on uncertain parameters starting from available measurements.
Research in UQ addresses fundamental mathematical and statistical challenges, but also has wide applicability in areas such as engineering and the environmental, physical and biological sciences.
This workshop focussed on mathematical challenges at the interface of applied mathematics, probability and statistics, numerical analysis, scientific computing and application domains.
The workshop served to bring together experts from those disciplines in order to enhance their interaction, to exchange ideas and to develop new, powerful methods for UQ.
A Tracking Approach to Parameter Estimation in Linear Ordinary Differential Equations
Ordinary Differential Equations are widely used to model chemical,
physical and biological processes, but they usually rely on parameters that are of
critical importance for the dynamics and need to be estimated directly from
the data. Classical statistical approaches (nonlinear least squares, maximum
likelihood estimator) can give unsatisfactory results because of computational
difficulties and ill-posedness of the statistical problem. New estimation
methods that use some nonparametric devices have been proposed to circumvent
these issues. We present a new estimator that shares properties with the Two-Step
estimator and Generalized Smoothing (introduced by Ramsay et al., 2007). We
introduce a perturbed model and use optimal control theory to construct
a criterion that aims at minimizing the discrepancy between the data and the model.
Here, we focus on the case of linear Ordinary Differential Equations as our
criterion has a closed-form expression that permits a detailed analysis. Our
approach avoids the use of a nonparametric estimator of the derivative, which
is one of the main causes of inaccuracy in Two-Step estimators. Moreover, we
take into account model discrepancy and our estimator is more robust to model
misspecification than classical methods. The discrepancy with the parametric
ODE model corresponds to the minimum perturbation (or control) to apply to the
initial model. Its qualitative analysis can be informative for misspecification
diagnosis. In the case of a well-specified model, we show the consistency of our
estimator and that we reach the parametric root-n rate when regression splines
are used in the first step.
Comment: 41 pages, 3 figures
Stochastic Reaction-Diffusion Systems in Biophysics: Towards a Toolbox for Quantitative Model Evaluation
We develop a statistical toolbox for a quantitative model evaluation of
stochastic reaction-diffusion systems modeling space-time evolution of
biophysical quantities on the intracellular level. Starting from space-time
data, as provided, e.g., in fluorescence microscopy recordings, we
discuss basic modelling principles for conditional mean trend and fluctuations
in the class of stochastic reaction-diffusion systems, and subsequently develop
statistical inference methods for parameter estimation. With a view towards
application to real data, we discuss estimation errors and confidence
intervals, in particular their dependence on the spatial resolution of measurements,
and investigate the impact of misspecified reaction terms and noise
coefficients. We also briefly touch on implementation issues of the statistical
estimators. As a proof of concept, we apply our toolbox to statistical
inference on intracellular actin concentration in the social amoeba
Dictyostelium discoideum.
Modeling Persistent Trends in Distributions
We present a nonparametric framework to model a short sequence of probability
distributions that vary both due to underlying effects of sequential
progression and confounding noise. To distinguish between these two types of
variation and estimate the sequential-progression effects, our approach
leverages an assumption that these effects follow a persistent trend. This work
is motivated by the recent rise of single-cell RNA-sequencing experiments over
a brief time course, which aim to identify genes relevant to the progression of
a particular biological process across diverse cell populations. While
classical statistical tools focus on scalar-response regression or
order-agnostic differences between distributions, it is desirable in this
setting to consider both the full distributions as well as the structure
imposed by their ordering. We introduce a new regression model for ordinal
covariates where responses are univariate distributions and the underlying
relationship reflects consistent changes in the distributions over increasing
levels of the covariate. This concept is formalized as a "trend" in
distributions, which we define as an evolution that is linear under the
Wasserstein metric. Implemented via a fast alternating projections algorithm,
our method exhibits numerous strengths in simulations and analyses of
single-cell gene expression data.
Comment: To appear in: Journal of the American Statistical Association
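As a small illustration of what "linear under the Wasserstein metric" can mean for univariate distributions, the sketch below uses the standard fact that, in one dimension, the 2-Wasserstein distance between two equal-size samples is the root-mean-square difference of their sorted values. It is not the authors' method or their alternating projections algorithm; the function names, the sample values and the `direction` vector are all hypothetical, chosen only to show distributions whose order statistics shift linearly with the covariate level.

```python
import math

def wasserstein2(xs, ys):
    """2-Wasserstein distance between two equal-size 1-D samples.

    In one dimension the optimal coupling matches sorted values,
    so the distance is the RMS difference of the order statistics.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    a, b = sorted(xs), sorted(ys)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)) / len(a))

def shift_along_trend(base, direction, level):
    """Move each order statistic of `base` linearly in `level` (hypothetical helper)."""
    return [q + level * d for q, d in zip(sorted(base), direction)]

# Hypothetical data: a base sample and a nondecreasing direction,
# so the shifted values remain sorted at every nonnegative level.
base = [0.0, 1.0, 2.0, 3.0]
direction = [0.5, 0.5, 1.0, 1.0]

p0 = shift_along_trend(base, direction, 0)
p1 = shift_along_trend(base, direction, 1)
p2 = shift_along_trend(base, direction, 2)

d01 = wasserstein2(p0, p1)
d12 = wasserstein2(p1, p2)
d02 = wasserstein2(p0, p2)

# Along such a sequence the distances add up, as along a straight line.
print(d01, d12, d02)
```

In this toy sequence the distance from the first to the third distribution equals the sum of the two consecutive distances, which is the sense in which the sequence moves along a straight line.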
Low-level analysis of microarray data
This thesis consists of an extensive introduction followed by seven papers (A-G) on low-level analysis of microarray data. The focus is on calibration and normalization of observed data. The introduction gives a brief background on microarray technology and its applications so that readers not familiar with the field can follow the thesis. Formal definitions of calibration and normalization are given. Paper A illustrates a typical statistical analysis of microarray data with background correction, normalization, and identification of differentially expressed genes (among thousands of candidates). A small analysis of the final results for different numbers of replicates and different image analysis software is also given. Paper B introduces a novel way of displaying microarray data called the print-order plot, which displays data in the order in which the corresponding spots were printed to the array. Using these plots, so-called (microtiter-) plate effects are identified. Then, based on a simple variability measure for replicated spots across arrays, different normalization sequences are tested and evidence for the existence of plate effects is claimed. Paper C presents an object-oriented extension with transparent reference variables to the R language. It provides the necessary foundation for implementing the microarray analysis package described in Paper F. Paper D is on affine transformations of two-channel microarray data and their effects on the log-ratio log-intensity transform. Affine transformations, that is, the existence of channel biases, can explain commonly observed intensity-dependent effects in the log-ratios. In the light of the affine transformation, several normalization methods are revisited. At the end of the paper, a new robust affine normalization is suggested that relies on iterative reweighted principal component analysis.
Paper E suggests a multiscan calibration method where each array is scanned at various sensitivity levels in order to uniquely identify the affine transformation of signals that the scanner and the image-analysis methods introduce. Observed data strongly support this method. In addition, multiscan-calibrated data has an extended dynamic range and higher signal-to-noise levels. This is real-world evidence for the existence of affine transformations of microarray data. Paper F describes the aroma package – An R Object-oriented Microarray Analysis environment – which is implemented in R and provides easy access to our and others' low-level analysis methods. Paper G provides a calibration method for spotted microarrays with dilution series or spike-ins. The method is based on a heteroscedastic affine stochastic model. The parameter estimates are robust against model misspecification.
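The claim in Paper D, that channel biases can produce intensity-dependent log-ratios, can be illustrated with a small numerical sketch. It assumes the common two-channel convention M = log2(R/G) for the log-ratio and A = (1/2) log2(R·G) for the log-intensity, and an affine model (observed = offset + scale × true signal) for each channel. The offsets, scales and signal levels are hypothetical values chosen for illustration, not figures from the thesis, and the code does not use the aroma package.

```python
import math

def log_ratio_intensity(r, g):
    """Return (M, A): log-ratio and average log-intensity of two channels."""
    m = math.log2(r / g)
    a = 0.5 * math.log2(r * g)
    return m, a

# Hypothetical channel biases: different offsets, identical scales.
offset_r, offset_g = 50.0, 10.0
scale_r, scale_g = 1.0, 1.0

# Non-differentially expressed spots: the true signal is the same in
# both channels, so an unbiased measurement would give M = 0 everywhere.
true_signals = [10.0, 100.0, 1000.0, 10000.0]

ms, avgs = [], []
for x in true_signals:
    r = offset_r + scale_r * x   # observed red-channel signal
    g = offset_g + scale_g * x   # observed green-channel signal
    m, a = log_ratio_intensity(r, g)
    ms.append(m)
    avgs.append(a)
    print(f"A = {a:6.2f}   M = {m:6.3f}")
```

With these assumed values the log-ratio is far from zero at low intensities and shrinks toward zero as intensity grows, even though no spot is differentially expressed, which is the kind of intensity-dependent effect the abstract attributes to affine transformations.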
Untangling hotel industry’s inefficiency: An SFA approach applied to a renowned Portuguese hotel chain
The present paper explores the technical efficiency of four hotels from Teixeira Duarte Group - a renowned Portuguese hotel chain. An efficiency ranking of these four hotel units, all located in Portugal, is established using Stochastic Frontier Analysis. This methodology makes it possible to discriminate between measurement error and systematic inefficiencies in the estimation process, and thus to investigate the main causes of inefficiency. Several suggestions for improving efficiency are made for each hotel studied.