The Bayesian Structural EM Algorithm
In recent years there has been a flurry of works on learning Bayesian
networks from data. One of the hard problems in this area is how to effectively
learn the structure of a belief network from incomplete data, that is, in the
presence of missing values or hidden variables. In a recent paper, I introduced
an algorithm called Structural EM that combines the standard Expectation
Maximization (EM) algorithm, which optimizes parameters, with structure search
for model selection. That algorithm learns networks based on penalized
likelihood scores, which include the BIC/MDL score and various approximations
to the Bayesian score. In this paper, I extend Structural EM to deal directly
with Bayesian model selection. I prove the convergence of the resulting
algorithm and show how to apply it for learning a large class of probabilistic
models, including Bayesian networks and some variants thereof.
Comment: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI 1998).
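As a concrete illustration of the idea behind Structural EM (a toy sketch, not the paper's general algorithm), the code below alternates an E-step that fills in expected counts for a partially observed binary variable with a structure-selection M-step that refits parameters for each candidate structure and keeps the one with the best BIC score. The two-variable setting, function names, and data generator are invented for illustration.

```python
# Toy Structural EM: two binary variables X, Y, with some Y values missing.
# Candidate structures: "independent" (no edge) vs "X->Y".
import math, random

random.seed(0)

def sample_data(n=500, p_x=0.5, p_y_given_x=(0.2, 0.9), miss=0.3):
    """X always observed; Y missing with probability `miss` (None = missing)."""
    data = []
    for _ in range(n):
        x = int(random.random() < p_x)
        y = int(random.random() < p_y_given_x[x])
        data.append((x, y if random.random() > miss else None))
    return data

def expected_counts(data, params):
    """E-step: expected joint counts N[x][y] under the current parameters."""
    _, p_y_x = params                         # p_x not needed: X is always observed
    N = [[1e-6, 1e-6], [1e-6, 1e-6]]          # tiny pseudo-counts avoid zeros
    for x, y in data:
        if y is None:
            N[x][1] += p_y_x[x]               # spread the instance over y=0,1
            N[x][0] += 1.0 - p_y_x[x]         # by its posterior P(y | x)
        else:
            N[x][y] += 1.0
    return N

def fit_and_score(N, structure, n):
    """M-step for one structure: ML parameters from expected counts plus BIC."""
    n_x = [N[0][0] + N[0][1], N[1][0] + N[1][1]]
    p_x = n_x[1] / (n_x[0] + n_x[1])
    if structure == "X->Y":
        p_y_x = [N[0][1] / n_x[0], N[1][1] / n_x[1]]
        dims = 3                               # P(X) plus two conditionals
    else:                                      # independent model
        p_y = (N[0][1] + N[1][1]) / (n_x[0] + n_x[1])
        p_y_x = [p_y, p_y]
        dims = 2
    ll = 0.0                                   # expected complete-data log-likelihood
    for x in (0, 1):
        px = p_x if x == 1 else 1.0 - p_x
        for y in (0, 1):
            py = p_y_x[x] if y == 1 else 1.0 - p_y_x[x]
            ll += N[x][y] * math.log(px * py)
    return (p_x, p_y_x), ll - 0.5 * dims * math.log(n)

data = sample_data()
params, structure = (0.5, [0.5, 0.5]), "independent"
for it in range(20):
    N = expected_counts(data, params)
    best = max((fit_and_score(N, s, len(data)) + (s,)
                for s in ("independent", "X->Y")), key=lambda t: t[1])
    params, score, structure = best
    print(f"iter {it}: structure={structure}, BIC={score:.1f}")
```

Because the E-step uses expected sufficient statistics from the current model, both parameter refitting and structure comparison can be done inside a single outer loop, which is the essence of the Structural EM scheme.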
Dimension Reduction in Singularly Perturbed Continuous-Time Bayesian Networks
Continuous-time Bayesian networks (CTBNs) are graphical representations of
multi-component continuous-time Markov processes as directed graphs. The edges
in the network represent direct influences among components. The joint rate
matrix of the multi-component process is specified by means of conditional rate
matrices for each component separately. This paper addresses the situation
where some of the components evolve on a time scale that is much shorter
than the time scale of the other components. In this paper, we prove
that in the limit where the separation of scales is infinite, the Markov
process converges (in distribution, or weakly) to a reduced, or effective
Markov process that only involves the slow components. We also demonstrate that
for reasonable separation of scale (an order of magnitude) the reduced process
is a good approximation of the marginal process over the slow components. We
provide a simple procedure for building a reduced CTBN for this effective
process, with conditional rate matrices that can be directly calculated from
the original CTBN, and discuss the implications for approximate reasoning in
large systems.
Comment: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI 2006).
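The numerical sketch below illustrates the standard time-scale-separation intuition behind such a reduction: in the limit of infinite separation, a natural reduced rate matrix for a slow component is the average of its conditional rate matrices, weighted by the stationary distribution of the fast component. The rate matrices and separation parameter are invented for illustration, and the paper's exact construction of the reduced CTBN may differ.

```python
# Time-scale separation for a two-component process: fast component F, slow
# component S whose conditional rate matrix depends on the state of F.
import numpy as np

def stationary(Q):
    """Stationary distribution pi of a CTMC rate matrix Q (solves pi @ Q = 0)."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])           # append the normalization constraint
    b = np.zeros(n + 1); b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

eps = 0.01                                      # F evolves ~1/eps times faster than S
Q_F = (1.0 / eps) * np.array([[-1.0,  1.0],
                              [ 2.0, -2.0]])    # fast component's rate matrix
# Conditional rate matrices of the slow component S, one per state of F.
Q_S_given_F = [np.array([[-0.5,  0.5],
                         [ 0.3, -0.3]]),
               np.array([[-2.0,  2.0],
                         [ 1.0, -1.0]])]

pi_F = stationary(Q_F)                          # roughly [2/3, 1/3] here
Q_S_reduced = sum(p * Q for p, Q in zip(pi_F, Q_S_given_F))
print("stationary distribution of F:", pi_F)
print("reduced rate matrix for S:\n", Q_S_reduced)
```

In a full CTBN the fast component's stationary distribution generally depends on the current state of the slow components, so the averaging is done per slow configuration; the snippet fixes one configuration to keep the example small.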
Being Bayesian about Network Structure
In many domains, we are interested in analyzing the structure of the
underlying distribution, e.g., whether one variable is a direct parent of the
other. Bayesian model-selection attempts to find the MAP model and use its
structure to answer these questions. However, when the amount of available data
is modest, there might be many models that have non-negligible posterior. Thus,
we want to compute the Bayesian posterior of a feature, i.e., the total posterior
probability of all models that contain it. In this paper, we propose a new
approach for this task. We first show how to efficiently compute a sum over the
exponential number of networks that are consistent with a fixed ordering over
network variables. This allows us to compute, for a given ordering, both the
marginal probability of the data and the posterior of a feature. We then use
this result as the basis for an algorithm that approximates the Bayesian
posterior of a feature. Our approach uses a Markov Chain Monte Carlo (MCMC)
method, but over orderings rather than over network structures. The space of
orderings is much smaller and more regular than the space of structures, and
has a smoother posterior `landscape'. We present empirical results on synthetic
and real-life datasets that compare our approach to full model averaging (when
possible), to MCMC over network structures, and to a non-Bayesian bootstrap
approach.
Comment: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI 2000).
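The computational core described above is that, for a fixed ordering, the marginal likelihood factors into independent per-variable sums over candidate parent sets drawn from that variable's predecessors. The toy sketch below demonstrates this decomposition with a BDe-style local score over binary variables; the data, parent-set bound, and treatment of the structure prior (assumed uniform and folded into the sum) are illustrative, not the paper's implementation.

```python
# For each ordering of 3 binary variables, compute log P(D | ordering) as a
# product over children of sums of local scores over allowed parent sets.
import itertools, math, random
from math import lgamma

random.seed(1)
n_vars, N = 3, 200
data = [[random.randint(0, 1) for _ in range(n_vars)] for _ in range(N)]

def local_log_score(child, parents, data):
    """log P(column `child` | columns `parents`) with a Beta(1,1) prior per row."""
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        c = counts.setdefault(key, [0, 0])
        c[row[child]] += 1
    score = 0.0
    for c0, c1 in counts.values():
        # marginal likelihood of a Bernoulli with a uniform Beta(1,1) prior
        score += lgamma(1 + c0) + lgamma(1 + c1) - lgamma(2 + c0 + c1) + lgamma(2)
    return score

def log_marginal_given_ordering(order, data, max_parents=2):
    """log P(D | ordering): sum (in log space) over parent sets, per child."""
    total = 0.0
    for pos, child in enumerate(order):
        preds = order[:pos]
        terms = [local_log_score(child, parents, data)
                 for k in range(min(len(preds), max_parents) + 1)
                 for parents in itertools.combinations(preds, k)]
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))  # log-sum-exp
    return total

for order in itertools.permutations(range(n_vars)):
    print(order, round(log_marginal_given_ordering(order, data), 2))
```

Because the sum over parent sets is polynomial for a bounded in-degree, each MCMC step over orderings can evaluate this quantity exactly, which is what makes the ordering space tractable to sample.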
On the Sample Complexity of Learning Bayesian Networks
In recent years there has been an increasing interest in learning Bayesian
networks from data. One of the most effective methods for learning such
networks is based on the minimum description length (MDL) principle. Previous
work has shown that this learning procedure is asymptotically successful: with
probability one, it will converge to the target distribution, given a
sufficient number of samples. However, the rate of this convergence has been
hitherto unknown. In this work we examine the sample complexity of MDL based
learning procedures for Bayesian networks. We show that the number of samples
needed to learn an epsilon-close approximation (in terms of entropy distance)
with confidence delta is O((1/epsilon)^(4/3) log(1/epsilon) log(1/delta) loglog(1/delta)).
This means that the sample complexity is a low-order polynomial in
the error threshold and sub-linear in the confidence bound. We also discuss how
the constants in this term depend on the complexity of the target distribution.
Finally, we address questions of asymptotic minimality and propose a method for
using the sample complexity results to speed up the learning process.
Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI 1996).
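Restating the bound from the abstract in display form (epsilon is the error in entropy distance, delta the confidence parameter; constants and their dependence on the target distribution are discussed in the paper):

```latex
\[
  N(\epsilon,\delta) \;=\;
  O\!\left( \left(\tfrac{1}{\epsilon}\right)^{4/3}
            \log\tfrac{1}{\epsilon}\,
            \log\tfrac{1}{\delta}\,
            \log\log\tfrac{1}{\delta} \right)
\]
```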
The Information Bottleneck EM Algorithm
Learning with hidden variables is a central challenge in probabilistic
graphical models that has important implications for many real-life problems.
The classical approach is using the Expectation Maximization (EM) algorithm.
This algorithm, however, can get trapped in local maxima. In this paper we
explore a new approach that is based on the Information Bottleneck principle.
In this approach, we view the learning problem as a tradeoff between two
information theoretic objectives. The first is to make the hidden variables
uninformative about the identity of specific instances. The second is to make
the hidden variables informative about the observed attributes. By exploring
different tradeoffs between these two objectives, we can gradually converge on
a high-scoring solution. As we show, the resulting Information Bottleneck
Expectation Maximization (IB-EM) algorithm manages to find solutions that are
superior to standard EM methods.
Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI 2003).
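The tradeoff between the two objectives can be written as an Information Bottleneck Lagrangian. The notation below is illustrative and may differ from the paper's exact formulation: T denotes the hidden variable, I the instance identity, Y the observed attributes, and beta the tradeoff parameter that is varied gradually from favoring compression toward favoring informativeness.

```latex
\[
  \mathcal{L}_{\mathrm{IB}}(\beta) \;=\; I(T;\, I) \;-\; \beta\, I(T;\, Y)
\]
```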
Learning the Dimensionality of Hidden Variables
A serious problem in learning probabilistic models is the presence of hidden
variables. These variables are not observed, yet interact with several of the
observed variables. Detecting hidden variables poses two problems: determining
the relations to other variables in the model and determining the number of
states of the hidden variable. In this paper, we address the latter problem in
the context of Bayesian networks. We describe an approach that utilizes a
score-based agglomerative state-clustering. As we show, this approach allows us
to efficiently evaluate models with a range of cardinalities for the hidden
variable. We show how to extend this procedure to deal with multiple
interacting hidden variables. We demonstrate the effectiveness of this approach
by evaluating it on synthetic and real-life data. We show that our approach
learns models with hidden variables that generalize better and have better
structure than previous approaches.
Comment: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI 2001).
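A toy, hedged sketch of the agglomerative idea follows: start the hidden variable with many states, then greedily merge the pair of states whose merge most improves a penalized score, stopping when no merge helps. The sketch uses hard assignments and a naive-Bayes-style BIC score on synthetic data; the paper's procedure operates on general Bayesian networks with EM-based statistics, so the details differ.

```python
import math, random
from collections import defaultdict

random.seed(2)
d, N = 4, 400
# Synthetic data from 3 latent clusters, so a "right" cardinality exists.
protos = [[0.9, 0.9, 0.1, 0.1], [0.1, 0.9, 0.9, 0.1], [0.1, 0.1, 0.9, 0.9]]
data = [[int(random.random() < p) for p in random.choice(protos)] for _ in range(N)]

def bic(assign, data):
    """BIC of a naive Bayes model with hard state assignments for the hidden H."""
    groups = defaultdict(list)
    for h, row in zip(assign, data):
        groups[h].append(row)
    ll, K = 0.0, len(groups)
    for rows in groups.values():
        n_h = len(rows)
        ll += n_h * math.log(n_h / len(data))         # log P(H = h) terms
        for j in range(d):
            c1 = sum(r[j] for r in rows)
            t = (c1 + 1) / (n_h + 2)                  # smoothed P(X_j = 1 | H = h)
            ll += c1 * math.log(t) + (n_h - c1) * math.log(1 - t)
    n_params = (K - 1) + K * d
    return ll - 0.5 * n_params * math.log(len(data))

# Start with one state per distinct attribute pattern, then merge greedily.
patterns = {tuple(r) for r in data}
state_of = {p: i for i, p in enumerate(patterns)}
assign = [state_of[tuple(r)] for r in data]
while True:
    states = sorted(set(assign))
    best = None
    for a in states:
        for b in states:
            if a < b:
                merged = [a if h == b else h for h in assign]
                s = bic(merged, data)
                if best is None or s > best[0]:
                    best = (s, merged)
    if best is None or best[0] <= bic(assign, data):
        break                                          # no merge improves the score
    assign = best[1]
print("chosen cardinality:", len(set(assign)))
```

The key property this illustrates is that one agglomerative pass scores models across a whole range of cardinalities, instead of rerunning learning from scratch for each candidate number of states.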
COARA: Code Offloading on Android with AspectJ
Smartphones suffer from limited computational capabilities and battery life.
A method to mitigate these problems is code offloading: executing application
code on a remote server. We introduce COARA, a middleware platform for code
offloading on Android that uses aspect-oriented programming (AOP) with AspectJ.
AOP allows COARA to intercept code for offloading without a customized compiler
or modification of the operating system. COARA requires minimal changes to
application source code, and does not require the application developer to be
aware of AOP. Since state transfer to the server is often a bottleneck that
hinders performance, COARA uses AOP to intercept the transmission of large
objects from the client and replaces them with object proxies. The server can
begin execution of the offloaded application code, regardless of whether all
required objects have been transferred to the server. We run COARA with Android
applications from the Google Play store on a Nexus 4 running unmodified Android
4.3 to prove that our platform improves performance and reduces energy
consumption. Our approach yields speedups of 24x and 6x over WiFi and 3G
respectively.
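COARA itself relies on AspectJ pointcuts on Android. As a language-neutral analogy only (not COARA's API or its AspectJ advice), the sketch below shows the two ideas from the abstract: intercepting a method call transparently to decide whether it runs remotely, and replacing a large argument with a lightweight proxy so the offloaded call can proceed while the full object is still in transit. All names are hypothetical and the "server" is simulated in-process.

```python
import functools, hashlib, pickle

class LargeObjectProxy:
    """Lightweight handle standing in for a large argument.  Here it still
    carries the pickled bytes; a real system would ship them asynchronously."""
    def __init__(self, obj):
        self.payload = pickle.dumps(obj)
        self.digest = hashlib.sha256(self.payload).hexdigest()   # identity of the object
    def resolve(self):
        return pickle.loads(self.payload)

def fake_server_execute(func, args):
    """Simulated remote endpoint: resolve any proxies, then run the function."""
    real_args = [a.resolve() if isinstance(a, LargeObjectProxy) else a for a in args]
    return func(*real_args)

def offloadable(size_threshold=1024):
    """Decorator playing the role of the interception layer."""
    def wrap(func):
        @functools.wraps(func)
        def inner(*args):
            shipped = [LargeObjectProxy(a) if len(pickle.dumps(a)) > size_threshold else a
                       for a in args]
            return fake_server_execute(func, shipped)
        return inner
    return wrap

@offloadable(size_threshold=1024)
def word_count(text):
    return len(text.split())

print(word_count("hello " * 10_000))   # the large string is proxied; prints 10000
```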
Stable regions and singular trajectories in chaotic soft wall billiards
We present numerical and experimental results for the development of islands
of stability in atom-optics billiards with soft walls. As the walls are softened,
stable regions appear near singular periodic trajectories in converging
(focusing) and dispersing billiards, and are surrounded by areas of
"stickiness" in phase-space. The size of these islands depends on the softness
of the potential in a very sensitive way.
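One generic way to parameterize wall "softness" is shown below; it is illustrative only, since the experiments realize soft walls with optical dipole potentials whose exact shape differs. A wall of height V_0 at radius R has its steepness set by sigma, recovering a hard wall in the limit sigma -> 0.

```latex
\[
  V(r) \;=\; \frac{V_0}{1 + e^{(R - r)/\sigma}}
\]
```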
Atom-Optics Billiards: Non-linear dynamics with cold atoms in optical traps
We present a new experimental system (the ``atom-optics billiard'') and
demonstrate chaotic and regular dynamics of cold, optically trapped atoms. We
show that the softness of the walls and additional optical potentials can be
used to manipulate the structure of phase space.
Comment: Lecture notes from the NATO ASI International Summer School on
Chaotic Dynamics and Transport in Classical and Quantum Systems, Cargese,
Corsica, August 200
Modeling Belief in Dynamic Systems, Part II: Revisions and Update
The study of belief change has been an active area in philosophy and AI. In
recent years two special cases of belief change, belief revision and belief
update, have been studied in detail. In a companion paper, we introduce a new
framework to model belief change. This framework combines temporal and
epistemic modalities with a notion of plausibility, allowing us to examine the
change of beliefs over time. In this paper, we show how belief revision and
belief update can be captured in our framework. This allows us to compare the
assumptions made by each method, and to better understand the principles
underlying them. In particular, it shows that Katsuno and Mendelzon's notion of
belief update depends on several strong assumptions that may limit its
applicability in artificial intelligence. Finally, our analysis allows us to
identify a notion of minimal change that underlies a broad range of belief
change operations, including revision and update.
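A toy illustration of the standard distinction between revision and update, using distance-based semantics over propositional worlds, follows; this is generic background rather than the plausibility-based framework of the paper. Revision keeps the new-information worlds that are globally closest to the belief set, whereas update changes each believed world separately and unions the results.

```python
from itertools import product

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def revise(belief, phi):
    """AGM-style revision: keep the phi-worlds globally closest to the belief set."""
    d = min(hamming(w, b) for w in phi for b in belief)
    return {w for w in phi if min(hamming(w, b) for b in belief) == d}

def update(belief, phi):
    """KM-style update: for each believed world, keep its closest phi-worlds."""
    result = set()
    for b in belief:
        d = min(hamming(w, b) for w in phi)
        result |= {w for w in phi if hamming(w, b) == d}
    return result

worlds = set(product((0, 1), repeat=2))         # worlds over variables (p, q)
belief = {(1, 0), (0, 1)}                        # believe exactly one of p, q holds
phi = {w for w in worlds if w[0] == 1}           # new information: p is true
print("revision:", revise(belief, phi))          # {(1, 0)} -- closest overall
print("update:  ", update(belief, phi))          # {(1, 0), (1, 1)} -- per-world change
```

The diverging outputs on the same inputs show why the two operations rest on different assumptions about whether the world is static (revision) or has changed (update).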