Regression models with MoPs Bayesian networks
We present a Bayesian network model for continuous variables in which densities and conditional densities are estimated with B-spline mixtures of polynomials (MoPs). We use a novel approach that exploits B-spline properties to obtain conditional density estimates directly. In particular, we implement naive Bayes and wrapper variable selection. Finally, we apply our techniques to the problem of predicting the morphological variables of neurons from their electrophysiological ones.
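A minimal Python sketch of the B-spline MoP idea for a univariate density (the knot layout, the histogram projection via nonnegative least squares, and the `bspline_density` helper are illustrative assumptions, not the authors' method):

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import nnls

def bspline_density(data, n_knots=8, degree=3):
    # Illustrative fit of a density as a nonnegative B-spline combination.
    lo, hi = data.min(), data.max()
    # Clamped uniform knot vector on the data range.
    knots = np.r_[[lo] * degree, np.linspace(lo, hi, n_knots), [hi] * degree]
    n_basis = len(knots) - degree - 1

    def basis(points):
        cols = [BSpline.basis_element(knots[i:i + degree + 2],
                                      extrapolate=False)(points)
                for i in range(n_basis)]
        return np.nan_to_num(np.column_stack(cols))

    # Project a histogram density estimate onto the basis; nonnegative
    # coefficients keep the fitted MoP a valid (nonnegative) density.
    hist, edges = np.histogram(data, bins=40, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    coef, _ = nnls(basis(centers), hist)

    grid = np.linspace(lo, hi, 400)
    dens = basis(grid) @ coef
    return grid, dens / np.trapz(dens, grid)  # normalise to integrate to 1

grid, f = bspline_density(np.random.default_rng(0).normal(size=2000))
```

This sketch covers only a marginal fit; the conditional estimates the abstract refers to rely on B-spline properties that this toy construction does not reproduce.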
Bayesian Analysis of Switching ARCH Models
We consider a time series model with autoregressive conditional heteroskedasticity that is subject to changes in regime. The regimes evolve according to a multistate latent Markov switching process with unknown transition probabilities, and it is the constant in the variance process of the innovations that is subject to regime shifts. The joint estimation of the latent process and all model parameters is performed within a Bayesian framework using the method of Markov Chain Monte Carlo simulation. One iteration of the sampler involves first a multi-move step to simulate the latent process out of its conditional distribution. The Gibbs sampler can then be used to simulate the parameters, in particular the transition probabilities, for which the full conditional posterior distribution is known. For most parameters, however, the full conditionals do not belong to any well-known family of distributions. The simulations are then based on the Metropolis-Hastings algorithm with carefully chosen proposal densities. We perform model selection with respect to the number of states and the number of autoregressive parameters in the variance process using Bayes factors and model likelihoods. To this aim, the model likelihood is estimated by combining the candidate's formula with importance sampling. The usefulness of the sampler is demonstrated by applying it to the dataset previously used by Hamilton and Susmel who investigated models with switching autoregressive conditional heteroskedasticity using maximum likelihood methods. The paper concludes with some issues related to maximum likelihood methods, to classical model selection, and to potential straightforward extensions of the model presented here.
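For the parameters whose full conditionals are nonstandard, the Metropolis-Hastings step described above can be pictured with a minimal sketch (the regime-switching ARCH(1) parameterisation, the half-normal prior, and the random-walk proposal are illustrative assumptions, not the paper's carefully tuned proposals):

```python
import numpy as np

def log_post(om0, y, states, alpha, omegas, prior_sd=10.0):
    # Regime-switching ARCH(1): h_t = omega_{s_t} + alpha * y_{t-1}^2,
    # with y_t | h_t ~ N(0, h_t); here we update omega for regime 0 only.
    if om0 <= 0:
        return -np.inf
    om = omegas.copy()
    om[0] = om0
    h = om[states[1:]] + alpha * y[:-1] ** 2
    loglik = -0.5 * np.sum(np.log(h) + y[1:] ** 2 / h)
    return loglik - 0.5 * (om0 / prior_sd) ** 2  # half-normal prior

def mh_step(om0, y, states, alpha, omegas, step=0.1,
            rng=np.random.default_rng(0)):
    prop = om0 + step * rng.standard_normal()   # random-walk proposal
    log_acc = (log_post(prop, y, states, alpha, omegas)
               - log_post(om0, y, states, alpha, omegas))
    return prop if np.log(rng.uniform()) < log_acc else om0
```

One full sweep of the sampler would combine such updates with the multi-move simulation of the latent regime path and the Gibbs update of the transition probabilities.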
Selection and Estimation for Mixed Graphical Models
We consider the problem of estimating the parameters in a pairwise graphical
model in which the distribution of each node, conditioned on the others, may
have a different parametric form. In particular, we assume that each node's
conditional distribution is in the exponential family. We identify restrictions
on the parameter space required for the existence of a well-defined joint
density, and establish the consistency of the neighbourhood selection approach
for graph reconstruction in high dimensions when the true underlying graph is
sparse. Motivated by our theoretical results, we investigate the selection of
edges between nodes whose conditional distributions take different parametric
forms, and show that efficiency can be gained if edge estimates obtained from
the regressions of particular nodes are used to reconstruct the graph. These
results are illustrated with examples of Gaussian, Bernoulli, Poisson and
exponential distributions. Our theoretical findings are corroborated by
evidence from simulation studies.
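A minimal sketch of the neighbourhood-selection idea for a mixed graph (the node families, the scikit-learn lasso/logistic solvers, and the AND combination rule are illustrative assumptions; the paper's efficiency gains come from a more careful combination of the node-wise estimates):

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

def neighbourhood_select(X, families, alpha=0.1):
    """X: (n, p) data; families[j] in {'gaussian', 'bernoulli'}."""
    p = X.shape[1]
    adj = np.zeros((p, p), dtype=bool)
    for j in range(p):
        others = np.delete(np.arange(p), j)
        if families[j] == 'gaussian':
            coef = Lasso(alpha=alpha).fit(X[:, others], X[:, j]).coef_
        else:  # Bernoulli node: L1-penalised logistic regression
            coef = LogisticRegression(penalty='l1', solver='liblinear',
                                      C=1.0 / alpha
                                      ).fit(X[:, others], X[:, j]).coef_.ravel()
        adj[j, others] = coef != 0
    return adj & adj.T  # AND rule: keep an edge only if both nodes select it
```

Poisson and exponential nodes would need an L1-penalised GLM solver for those families, which scikit-learn does not provide out of the box.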
The distribution of a linear predictor after model selection: Unconditional finite-sample distributions and asymptotic approximations
We analyze the (unconditional) distribution of a linear predictor that is
constructed after a data-driven model selection step in a linear regression
model. First, we derive the exact finite-sample cumulative distribution
function (cdf) of the linear predictor, and a simple approximation to this
(complicated) cdf. We then analyze the large-sample limit behavior of these
cdfs, in the fixed-parameter case and under local alternatives.
Comment: Published at http://dx.doi.org/10.1214/074921706000000518 in the IMS Lecture Notes--Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org).
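The effect the paper studies is easy to reproduce by simulation; a minimal Monte Carlo sketch (the design, the AIC-style selection rule, and all constants are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, sigma = 50, np.array([1.0, 0.15]), 1.0
X = rng.standard_normal((n, 2))
x_new = np.array([1.0, 1.0])

preds = []
for _ in range(20000):
    y = X @ beta + sigma * rng.standard_normal(n)
    # Full model vs. model omitting the second regressor.
    b_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    b_small, *_ = np.linalg.lstsq(X[:, :1], y, rcond=None)
    rss_full = np.sum((y - X @ b_full) ** 2)
    rss_small = np.sum((y - X[:, :1] @ b_small) ** 2)
    # AIC-style rule: keep the second regressor only if it pays its way.
    if n * np.log(rss_small) > n * np.log(rss_full) + 2:
        preds.append(x_new @ b_full)
    else:
        preds.append(x_new[0] * b_small[0])
# The histogram of `preds` is typically non-normal (often bimodal),
# unlike the fixed-model least-squares predictor.
```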
Multiscale Dictionary Learning for Estimating Conditional Distributions
Nonparametric estimation of the conditional distribution of a response given
high-dimensional features is a challenging problem. It is important to allow
not only the mean but also the variance and shape of the response density to
change flexibly with the features, which are massive-dimensional. We propose a
multiscale dictionary learning model, which expresses the conditional response
density as a convex combination of dictionary densities, with the densities
used and their weights dependent on the path through a tree decomposition of
the feature space. A fast graph partitioning algorithm is applied to obtain the
tree decomposition, with Bayesian methods then used to adaptively prune and
average over different sub-trees in a soft probabilistic manner. The algorithm
scales efficiently to approximately one million features. State-of-the-art predictive performance is demonstrated on toy examples and on two neuroscience applications involving up to a million features.
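A toy sketch of the multiscale construction (the variance-based splits standing in for graph partitioning, the Gaussian leaf densities, and the depth-doubling path weights are all illustrative assumptions; the paper instead learns dictionary densities and prunes/averages sub-trees with Bayesian methods):

```python
import numpy as np
from scipy.stats import norm

def build_tree(X, y, depth=3):
    # Each node stores a simple Gaussian "dictionary" density for y.
    node = {'mu': y.mean(), 'sd': y.std() + 1e-6}
    if depth > 0 and len(y) > 20:
        j = np.argmax(X.var(axis=0))   # stand-in for graph partitioning
        t = np.median(X[:, j])
        left = X[:, j] <= t
        node.update(split=(j, t),
                    left=build_tree(X[left], y[left], depth - 1),
                    right=build_tree(X[~left], y[~left], depth - 1))
    return node

def cond_density(node, x, grid):
    # Convex combination of densities along x's root-to-leaf path; each
    # deeper node gets twice its parent's weight (an arbitrary schedule).
    dens, w, total = np.zeros_like(grid), 1.0, 0.0
    while True:
        dens += w * norm.pdf(grid, node['mu'], node['sd'])
        total += w
        if 'split' not in node:
            return dens / total
        j, t = node['split']
        node = node['left'] if x[j] <= t else node['right']
        w *= 2.0
```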
New Estimation Approaches for the Hierarchical Linear Ballistic Accumulator Model
The Linear Ballistic Accumulator (Brown & Heathcote, 2008) model is used as a
measurement tool to answer questions about applied psychology. The analyses
based on this model depend upon the model selected and its estimated
parameters. Modern approaches use hierarchical Bayesian models and Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of the
parameters. Although there are several approaches available for model
selection, they are all based on the posterior samples produced via MCMC, which
means that the model selection inference inherits the properties of the MCMC
sampler. To improve on current approaches to LBA inference we propose two
methods that are based on recent advances in particle MCMC methodology; they
are qualitatively different from existing approaches as well as from each
other. The first approach is particle Metropolis-within-Gibbs; the second
approach is density tempered sequential Monte Carlo. Both new approaches
provide very efficient sampling and can be applied to estimate the marginal
likelihood, which provides Bayes factors for model selection. The first
approach is usually faster. The second approach provides a direct estimate of
the marginal likelihood, uses the first approach in its Markov move step and is
very efficient to parallelize on high performance computers. The new methods
are illustrated by applying them to simulated and real data, and through pseudocode. The code implementing the methods is freely available.
Comment: 35 pages, 6 figures, 7 tables.
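The density-tempered SMC approach can be pictured with a generic sketch (the fixed tempering schedule, multinomial resampling, and omitted move step are illustrative simplifications; the paper adapts the schedule and uses particle Metropolis-within-Gibbs for the move step):

```python
import numpy as np

def dt_smc(prior_sample, log_like, n_particles=1000, n_temps=20, seed=0):
    rng = np.random.default_rng(seed)
    theta = prior_sample(n_particles)            # particles from the prior
    ll = np.array([log_like(t) for t in theta])
    log_ml = 0.0
    temps = np.linspace(0.0, 1.0, n_temps + 1)
    for g_prev, g in zip(temps[:-1], temps[1:]):
        w = (g - g_prev) * ll                    # incremental log-weights
        m = w.max()
        log_ml += m + np.log(np.mean(np.exp(w - m)))  # evidence increment
        probs = np.exp(w - m)
        probs /= probs.sum()
        idx = rng.choice(n_particles, size=n_particles, p=probs)  # resample
        theta, ll = theta[idx], ll[idx]
        # A real sampler would now rejuvenate the particles with MCMC moves
        # targeting the tempered posterior at temperature g.
    return log_ml                                # estimate of log p(y)
```

Because each incremental weight depends only on the current temperature step, the log marginal likelihood accumulates directly, which is what yields the Bayes factors used for model selection.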
Conditional Density Estimation by Penalized Likelihood Model Selection and Applications
In this technical report, we consider conditional density estimation with a
maximum likelihood approach. Under weak assumptions, we obtain a theoretical
bound for a Kullback-Leibler type loss for a single model maximum likelihood
estimate. We use a penalized model selection technique to select a best model
within a collection. We give a general condition on penalty choice that leads
to oracle type inequality for the resulting estimate. This construction is
applied to two examples of partition-based conditional density models, models
in which the conditional density depends only in a piecewise manner from the
covariate. The first example relies on classical piecewise polynomial densities
while the second uses Gaussian mixtures with varying mixing proportion but same
mixture components. We show how this last case is related to an unsupervised
segmentation application that has been the source of our motivation to this
study.Comment: No. RR-7596 (2011
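A minimal sketch of the penalized-likelihood selection step (the BIC-style penalty, the candidate families, and their maximum-likelihood fits are illustrative assumptions; the report's penalties are calibrated to yield the oracle-type inequality):

```python
import numpy as np
from scipy.stats import laplace, norm

def select_model(y, models, pen_const=0.5):
    """models: name -> (max_loglik_fn, dim); pick min of -loglik + penalty."""
    n = len(y)
    scores = {name: -fit(y) + pen_const * dim * np.log(n)  # BIC-style
              for name, (fit, dim) in models.items()}
    return min(scores, key=scores.get), scores

models = {
    'gaussian': (lambda y: norm.logpdf(y, y.mean(), y.std()).sum(), 2),
    'laplace':  (lambda y: laplace.logpdf(y, np.median(y),
                 np.abs(y - np.median(y)).mean()).sum(), 2),
}
best, scores = select_model(np.random.default_rng(0).normal(size=500), models)
```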