30,326 research outputs found

    Symbolic Exact Inference for Discrete Probabilistic Programs

    Full text link
    The computational burden of probabilistic inference remains a hurdle for applying probabilistic programming languages to practical problems of interest. In this work, we provide a semantic and algorithmic foundation for efficient exact inference on discrete-valued finite-domain imperative probabilistic programs. We leverage and generalize efficient inference procedures for Bayesian networks, which exploit the structure of the network to decompose the inference task, thereby avoiding full path enumeration. To do this, we first compile probabilistic programs to a symbolic representation. Then we adapt techniques from the probabilistic logic programming and artificial intelligence communities in order to perform inference on the symbolic representation. We formalize our approach, prove it sound, and experimentally validate it against existing exact and approximate inference techniques. We show that our inference approach is competitive with inference procedures specialized for Bayesian networks, thereby expanding the class of probabilistic programs that can be practically analyzed

    Marginal integration for nonparametric causal inference

    Full text link
    We consider the problem of inferring the total causal effect of a single variable intervention on a (response) variable of interest. We propose a certain marginal integration regression technique for a very general class of potentially nonlinear structural equation models (SEMs) with known structure, or at least known superset of adjustment variables: we call the procedure S-mint regression. We easily derive that it achieves the convergence rate as for nonparametric regression: for example, single variable intervention effects can be estimated with convergence rate n2/5n^{-2/5} assuming smoothness with twice differentiable functions. Our result can also be seen as a major robustness property with respect to model misspecification which goes much beyond the notion of double robustness. Furthermore, when the structure of the SEM is not known, we can estimate (the equivalence class of) the directed acyclic graph corresponding to the SEM, and then proceed by using S-mint based on these estimates. We empirically compare the S-mint regression method with more classical approaches and argue that the former is indeed more robust, more reliable and substantially simpler.Comment: 40 pages, 14 figure

    Active Topology Inference using Network Coding

    Get PDF
    Our goal is to infer the topology of a network when (i) we can send probes between sources and receivers at the edge of the network and (ii) intermediate nodes can perform simple network coding operations, i.e., additions. Our key intuition is that network coding introduces topology-dependent correlation in the observations at the receivers, which can be exploited to infer the topology. For undirected tree topologies, we design hierarchical clustering algorithms, building on our prior work. For directed acyclic graphs (DAGs), first we decompose the topology into a number of two-source, two-receiver (2-by-2) subnetwork components and then we merge these components to reconstruct the topology. Our approach for DAGs builds on prior work on tomography, and improves upon it by employing network coding to accurately distinguish among all different 2-by-2 components. We evaluate our algorithms through simulation of a number of realistic topologies and compare them to active tomographic techniques without network coding. We also make connections between our approach and alternatives, including passive inference, traceroute, and packet marking

    Multi-path Probabilistic Available Bandwidth Estimation through Bayesian Active Learning

    Full text link
    Knowing the largest rate at which data can be sent on an end-to-end path such that the egress rate is equal to the ingress rate with high probability can be very practical when choosing transmission rates in video streaming or selecting peers in peer-to-peer applications. We introduce probabilistic available bandwidth, which is defined in terms of ingress rates and egress rates of traffic on a path, rather than in terms of capacity and utilization of the constituent links of the path like the standard available bandwidth metric. In this paper, we describe a distributed algorithm, based on a probabilistic graphical model and Bayesian active learning, for simultaneously estimating the probabilistic available bandwidth of multiple paths through a network. Our procedure exploits the fact that each packet train provides information not only about the path it traverses, but also about any path that shares a link with the monitored path. Simulations and PlanetLab experiments indicate that this process can dramatically reduce the number of probes required to generate accurate estimates
    corecore