41,674 research outputs found
Consistent Second-Order Conic Integer Programming for Learning Bayesian Networks
Bayesian Networks (BNs) represent conditional probability relations among a
set of random variables (nodes) in the form of a directed acyclic graph (DAG),
and have found diverse applications in knowledge discovery. We study the
problem of learning the sparse DAG structure of a BN from continuous
observational data. The central problem can be modeled as a mixed-integer
program with an objective function composed of a convex quadratic loss function
and a regularization penalty subject to linear constraints. The optimal
solution to this mathematical program is known to have desirable statistical
properties under certain conditions. However, the state-of-the-art optimization
solvers are not able to obtain provably optimal solutions to the existing
mathematical formulations for medium-size problems within reasonable
computational times. To address this difficulty, we tackle the problem from
both computational and statistical perspectives. On the one hand, we propose a
concrete early stopping criterion to terminate the branch-and-bound process in
order to obtain a near-optimal solution to the mixed-integer program, and
establish the consistency of this approximate solution. On the other hand, we
improve the existing formulations by replacing the linear "big-" constraints
that represent the relationship between the continuous and binary indicator
variables with second-order conic constraints. Our numerical results
demonstrate the effectiveness of the proposed approaches
Bayesian Dropout
Dropout has recently emerged as a powerful and simple method for training
neural networks preventing co-adaptation by stochastically omitting neurons.
Dropout is currently not grounded in explicit modelling assumptions which so
far has precluded its adoption in Bayesian modelling. Using Bayesian entropic
reasoning we show that dropout can be interpreted as optimal inference under
constraints. We demonstrate this on an analytically tractable regression model
providing a Bayesian interpretation of its mechanism for regularizing and
preventing co-adaptation as well as its connection to other Bayesian
techniques. We also discuss two general approximate techniques for applying
Bayesian dropout for general models, one based on an analytical approximation
and the other on stochastic variational techniques. These techniques are then
applied to a Baysian logistic regression problem and are shown to improve
performance as the model become more misspecified. Our framework roots dropout
as a theoretically justified and practical tool for statistical modelling
allowing Bayesians to tap into the benefits of dropout training.Comment: 21 pages, 3 figures. Manuscript prepared 2014 and awaiting submissio
Learning and Designing Stochastic Processes from Logical Constraints
Stochastic processes offer a flexible mathematical formalism to model and
reason about systems. Most analysis tools, however, start from the premises
that models are fully specified, so that any parameters controlling the
system's dynamics must be known exactly. As this is seldom the case, many
methods have been devised over the last decade to infer (learn) such parameters
from observations of the state of the system. In this paper, we depart from
this approach by assuming that our observations are {\it qualitative}
properties encoded as satisfaction of linear temporal logic formulae, as
opposed to quantitative observations of the state of the system. An important
feature of this approach is that it unifies naturally the system identification
and the system design problems, where the properties, instead of observations,
represent requirements to be satisfied. We develop a principled statistical
estimation procedure based on maximising the likelihood of the system's
parameters, using recent ideas from statistical machine learning. We
demonstrate the efficacy and broad applicability of our method on a range of
simple but non-trivial examples, including rumour spreading in social networks
and hybrid models of gene regulation
Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces
In practical Bayesian optimization, we must often search over structures with
differing numbers of parameters. For instance, we may wish to search over
neural network architectures with an unknown number of layers. To relate
performance data gathered for different architectures, we define a new kernel
for conditional parameter spaces that explicitly includes information about
which parameters are relevant in a given structure. We show that this kernel
improves model quality and Bayesian optimization results over several simpler
baseline kernels.Comment: 6 pages, 3 figures. Appeared in the NIPS 2013 workshop on Bayesian
optimizatio
- …