
    The Bayesian Structural EM Algorithm

    In recent years there has been a flurry of work on learning Bayesian networks from data. One of the hard problems in this area is how to effectively learn the structure of a belief network from incomplete data, that is, in the presence of missing values or hidden variables. In a recent paper, I introduced an algorithm called Structural EM that combines the standard Expectation Maximization (EM) algorithm, which optimizes parameters, with structure search for model selection. That algorithm learns networks based on penalized likelihood scores, which include the BIC/MDL score and various approximations to the Bayesian score. In this paper, I extend Structural EM to deal directly with Bayesian model selection. I prove the convergence of the resulting algorithm and show how to apply it for learning a large class of probabilistic models, including Bayesian networks and some variants thereof.
    Comment: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI1998).
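
    As a quick sketch of the alternating scheme (our notation for the penalized-likelihood setting the abstract describes, not a formula quoted from the paper): at iteration n, the E-step completes the data using the current network (G_n, theta_n), and the M-step selects the structure and parameters that maximize the expected penalized score

        Q(G, \theta : G_n, \theta_n) =
            \mathbb{E}_{\mathbf{h} \sim P(\mathbf{h} \mid \mathbf{o}, G_n, \theta_n)}
            \bigl[ \log P(\mathbf{o}, \mathbf{h} \mid G, \theta) \bigr] - \mathrm{Pen}(G),

    where o are the observed values and h the missing ones. The Bayesian extension swaps the penalized likelihood for the Bayesian score while, per the abstract, preserving the convergence guarantee.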

    Dimension Reduction in Singularly Perturbed Continuous-Time Bayesian Networks

    Continuous-time Bayesian networks (CTBNs) are graphical representations of multi-component continuous-time Markov processes as directed graphs. The edges in the network represent direct influences among components. The joint rate matrix of the multi-component process is specified by means of conditional rate matrices for each component separately. This paper addresses the situation where some of the components evolve on a much shorter time scale than the others. We prove that in the limit where the separation of scales is infinite, the Markov process converges (in distribution, i.e., weakly) to a reduced, or effective, Markov process that involves only the slow components. We also demonstrate that for a reasonable separation of scales (an order of magnitude), the reduced process is a good approximation of the marginal process over the slow components. We provide a simple procedure for building a reduced CTBN for this effective process, with conditional rate matrices that can be calculated directly from the original CTBN, and discuss the implications for approximate reasoning in large systems.
    Comment: Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006).
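
    To make the averaging idea concrete, here is a minimal numerical sketch (a toy of our own construction, not the paper's procedure or notation). In the infinite-separation limit, the slow component sees the fast component only through its stationary distribution, so the effective rate matrix is a stationary-weighted average of the conditional rate matrices. For simplicity the fast rates below do not depend on the slow state; in general the stationary distribution must be computed per configuration of the slow components.

        import numpy as np

        def stationary_distribution(Q):
            """Stationary distribution pi of rate matrix Q (rows sum to zero),
            solved from pi @ Q = 0 subject to sum(pi) = 1."""
            n = Q.shape[0]
            A = np.vstack([Q.T, np.ones(n)])
            b = np.zeros(n + 1)
            b[-1] = 1.0
            pi, *_ = np.linalg.lstsq(A, b, rcond=None)
            return pi

        # Conditional rate matrices of the slow component X, one per state of
        # the fast component Y (hypothetical numbers).
        Q_x_given_y = [np.array([[-1.0, 1.0], [2.0, -2.0]]),
                       np.array([[-5.0, 5.0], [1.0, -1.0]])]

        # Rate matrix of the fast component: two orders of magnitude faster.
        Q_y = np.array([[-100.0, 100.0], [300.0, -300.0]])

        # Effective slow dynamics: average the conditional rates under the
        # fast component's stationary distribution.
        pi_y = stationary_distribution(Q_y)
        Q_x_effective = sum(p * Q for p, Q in zip(pi_y, Q_x_given_y))
        print(Q_x_effective)  # rows still sum to zero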

    The Information Bottleneck EM Algorithm

    Learning with hidden variables is a central challenge in probabilistic graphical models that has important implications for many real-life problems. The classical approach uses the Expectation Maximization (EM) algorithm. This algorithm, however, can get trapped in local maxima. In this paper we explore a new approach based on the Information Bottleneck principle, which views the learning problem as a tradeoff between two information-theoretic objectives: making the hidden variables uninformative about the identity of specific instances, and making them informative about the observed attributes. By exploring different tradeoffs between these two objectives, we can gradually converge on a high-scoring solution. As we show, the resulting Information Bottleneck Expectation Maximization (IB-EM) algorithm finds solutions superior to those of standard EM methods.
    Comment: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003).
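
    In symbols, the tradeoff can be written as a Lagrangian of the form (our notation; the paper's exact formulation may differ)

        \mathcal{L}_{\beta} = I(T; \mathrm{ID}) - \beta \, I(T; \mathbf{X}),

    where T is the hidden variable, ID indexes the specific training instance, and X are the observed attributes. Annealing beta upward from 0 traces a path from a trivial, fully compressed solution toward a high-scoring one, which is how the method avoids committing early to a poor local maximum.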

    Learning the Dimensionality of Hidden Variables

    A serious problem in learning probabilistic models is the presence of hidden variables. These variables are not observed, yet interact with several of the observed variables. Detecting hidden variables poses two problems: determining the relations to other variables in the model, and determining the number of states of the hidden variable. In this paper, we address the latter problem in the context of Bayesian networks. We describe an approach that utilizes score-based agglomerative state clustering. As we show, this approach allows us to efficiently evaluate models with a range of cardinalities for the hidden variable. We show how to extend this procedure to deal with multiple interacting hidden variables. We demonstrate the effectiveness of this approach on synthetic and real-life data, and show that it learns models with hidden variables that generalize better and have better structure than previous approaches.
    Comment: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001).
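
    The agglomerative search itself is easy to sketch. The following Python toy shows the greedy merge loop with a stand-in score; in the paper, candidate merges are scored against the Bayesian network itself.

        from itertools import combinations

        def merge(states, a, b):
            """Replace states a and b with their union."""
            rest = [s for i, s in enumerate(states) if i not in (a, b)]
            return rest + [states[a] | states[b]]

        def agglomerate(states, score, min_states=1):
            """Greedily merge the pair of hidden states whose merge most
            improves score(states); stop when no merge helps."""
            best = score(states)
            while len(states) > min_states:
                candidates = [(score(merge(states, a, b)), a, b)
                              for a, b in combinations(range(len(states)), 2)]
                s, a, b = max(candidates)
                if s <= best:
                    break
                states, best = merge(states, a, b), s
            return states

        # Toy usage: a state is the set of data cases assigned to it, and the
        # (hypothetical) score prefers fewer, more balanced states.
        score = lambda states: -len(states) - 1e-3 * sum(len(s) ** 2 for s in states)
        print(agglomerate([{0}, {1}, {2}, {3}], score))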

    Being Bayesian about Network Structure

    In many domains, we are interested in analyzing the structure of the underlying distribution, e.g., whether one variable is a direct parent of another. Bayesian model selection attempts to find the MAP model and use its structure to answer these questions. However, when the amount of available data is modest, there may be many models with non-negligible posterior. Thus, we want to compute the Bayesian posterior of a feature, i.e., the total posterior probability of all models that contain it. In this paper, we propose a new approach for this task. We first show how to efficiently compute a sum over the exponential number of networks that are consistent with a fixed ordering over network variables. This allows us to compute, for a given ordering, both the marginal probability of the data and the posterior of a feature. We then use this result as the basis for an algorithm that approximates the Bayesian posterior of a feature. Our approach uses a Markov Chain Monte Carlo (MCMC) method, but over orderings rather than over network structures. The space of orderings is much smaller and more regular than the space of structures, and has a smoother posterior "landscape". We present empirical results on synthetic and real-life datasets that compare our approach to full model averaging (when possible), to MCMC over network structures, and to a non-Bayesian bootstrap approach.
    Comment: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000).
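
    The sampler skeleton is ordinary Metropolis-Hastings with a symmetric swap proposal. The paper's key ingredient, the closed-form marginal score of an ordering (the sum over all consistent parent sets), is replaced by a placeholder here:

        import math, random

        def mcmc_over_orderings(n_vars, log_score, steps=10000, seed=0):
            """Metropolis-Hastings over orderings of n_vars variables."""
            rng = random.Random(seed)
            order = list(range(n_vars))
            current = log_score(order)
            samples = []
            for _ in range(steps):
                i, j = rng.sample(range(n_vars), 2)
                proposal = order[:]
                proposal[i], proposal[j] = proposal[j], proposal[i]  # swap move
                new = log_score(proposal)
                if math.log(rng.random()) < new - current:  # MH accept/reject
                    order, current = proposal, new
                samples.append(tuple(order))
            return samples

        # Placeholder score that prefers orderings close to 0, 1, ..., n-1.
        toy_score = lambda o: -float(sum(abs(pos - var) for pos, var in enumerate(o)))
        print(mcmc_over_orderings(5, toy_score, steps=2000)[-1])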

    On the Sample Complexity of Learning Bayesian Networks

    In recent years there has been increasing interest in learning Bayesian networks from data. One of the most effective methods for learning such networks is based on the minimum description length (MDL) principle. Previous work has shown that this learning procedure is asymptotically successful: with probability one, it will converge to the target distribution given a sufficient number of samples. However, the rate of this convergence has been hitherto unknown. In this work we examine the sample complexity of MDL-based learning procedures for Bayesian networks. We show that the number of samples needed to learn an epsilon-close approximation (in terms of entropy distance) with confidence delta is O((1/epsilon)^(4/3) log(1/epsilon) log(1/delta) loglog(1/delta)). This means that the sample complexity is a low-order polynomial in the error threshold and sub-linear in the confidence bound. We also discuss how the constants in this term depend on the complexity of the target distribution. Finally, we address questions of asymptotic minimality and propose a method for using the sample complexity results to speed up the learning process.
    Comment: Appears in Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence (UAI1996).
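
    Typeset, the bound stated above reads

        N(\epsilon, \delta) = O\!\left( (1/\epsilon)^{4/3} \, \log(1/\epsilon) \, \log(1/\delta) \, \log\log(1/\delta) \right).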

    COARA: Code Offloading on Android with AspectJ

    Smartphones suffer from limited computational capability and battery life. One way to mitigate these problems is code offloading: executing application code on a remote server. We introduce COARA, a middleware platform for code offloading on Android that uses aspect-oriented programming (AOP) with AspectJ. AOP allows COARA to intercept code for offloading without a customized compiler or modification of the operating system. COARA requires minimal changes to application source code and does not require the application developer to be aware of AOP. Since state transfer to the server is often a bottleneck that hinders performance, COARA uses AOP to intercept the transmission of large objects from the client and replace them with object proxies. The server can begin executing the offloaded application code regardless of whether all required objects have been transferred. We run COARA with Android applications from the Google Play store on a Nexus 4 running unmodified Android 4.3 to demonstrate that our platform improves performance and reduces energy consumption, yielding speedups of 24x and 6x over WiFi and 3G, respectively.
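
    The object-proxy idea generalizes beyond AspectJ. The following Python sketch (our own illustration, not COARA code, which is Java/AspectJ) shows a stand-in for a large object that blocks only if the offloaded code actually touches the still-in-transit state:

        import threading

        class LazyProxy:
            """Stands in for an object whose transfer completes asynchronously."""

            def __init__(self):
                self._ready = threading.Event()
                self._target = None

            def fulfill(self, obj):
                """Called when the transfer finishes."""
                self._target = obj
                self._ready.set()

            def __getattr__(self, name):
                # Only reached for attributes not set in __init__: block until
                # the real object arrives, then delegate to it.
                self._ready.wait()
                return getattr(self._target, name)

        # Simulate a transfer that completes 0.1 s after execution begins.
        proxy = LazyProxy()
        threading.Timer(0.1, proxy.fulfill, args=([1, 2, 3],)).start()
        print(proxy.count(2))  # blocks briefly, then delegates to the real list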

    Stable regions and singular trajectories in chaotic soft wall billiards

    We present numerical and experimental results for the development of islands of stability in atom-optics billiards with soft walls. As the walls are softened, stable regions appear near singular periodic trajectories in converging (focusing) and dispersing billiards, and are surrounded by areas of "stickiness" in phase space. The size of these islands depends very sensitively on the softness of the potential.

    Atom-Optics Billiards: Non-linear dynamics with cold atoms in optical traps

    We present a new experimental system (the "atom-optics billiard") and demonstrate chaotic and regular dynamics of cold, optically trapped atoms. We show that the softness of the walls and additional optical potentials can be used to manipulate the structure of phase space.
    Comment: Lecture notes from the NATO ASI International Summer School on Chaotic Dynamics and Transport in Classical and Quantum Systems, Cargese, Corsica, August 200

    Modeling Belief in Dynamic Systems, Part II: Revisions and Update

    The study of belief change has been an active area in philosophy and AI. In recent years two special cases of belief change, belief revision and belief update, have been studied in detail. In a companion paper, we introduce a new framework to model belief change that combines temporal and epistemic modalities with a notion of plausibility, allowing us to examine how beliefs change over time. In this paper, we show how belief revision and belief update can be captured in our framework. This allows us to compare the assumptions made by each method and to better understand the principles underlying them. In particular, it shows that Katsuno and Mendelzon's notion of belief update depends on several strong assumptions that may limit its applicability in artificial intelligence. Finally, our analysis allows us to identify a notion of minimal change that underlies a broad range of belief change operations, including revision and update.