30,326 research outputs found
Symbolic Exact Inference for Discrete Probabilistic Programs
The computational burden of probabilistic inference remains a hurdle for
applying probabilistic programming languages to practical problems of interest.
In this work, we provide a semantic and algorithmic foundation for efficient
exact inference on discrete-valued finite-domain imperative probabilistic
programs. We leverage and generalize efficient inference procedures for
Bayesian networks, which exploit the structure of the network to decompose the
inference task, thereby avoiding full path enumeration. To do this, we first
compile probabilistic programs to a symbolic representation. Then we adapt
techniques from the probabilistic logic programming and artificial intelligence
communities in order to perform inference on the symbolic representation. We
formalize our approach, prove it sound, and experimentally validate it against
existing exact and approximate inference techniques. We show that our inference
approach is competitive with inference procedures specialized for Bayesian
networks, thereby expanding the class of probabilistic programs that can be
practically analyzed
Marginal integration for nonparametric causal inference
We consider the problem of inferring the total causal effect of a single
variable intervention on a (response) variable of interest. We propose a
certain marginal integration regression technique for a very general class of
potentially nonlinear structural equation models (SEMs) with known structure,
or at least known superset of adjustment variables: we call the procedure
S-mint regression. We easily derive that it achieves the convergence rate as
for nonparametric regression: for example, single variable intervention effects
can be estimated with convergence rate assuming smoothness with
twice differentiable functions. Our result can also be seen as a major
robustness property with respect to model misspecification which goes much
beyond the notion of double robustness. Furthermore, when the structure of the
SEM is not known, we can estimate (the equivalence class of) the directed
acyclic graph corresponding to the SEM, and then proceed by using S-mint based
on these estimates. We empirically compare the S-mint regression method with
more classical approaches and argue that the former is indeed more robust, more
reliable and substantially simpler.Comment: 40 pages, 14 figure
Active Topology Inference using Network Coding
Our goal is to infer the topology of a network when (i) we can send probes
between sources and receivers at the edge of the network and (ii) intermediate
nodes can perform simple network coding operations, i.e., additions. Our key
intuition is that network coding introduces topology-dependent correlation in
the observations at the receivers, which can be exploited to infer the
topology. For undirected tree topologies, we design hierarchical clustering
algorithms, building on our prior work. For directed acyclic graphs (DAGs),
first we decompose the topology into a number of two-source, two-receiver
(2-by-2) subnetwork components and then we merge these components to
reconstruct the topology. Our approach for DAGs builds on prior work on
tomography, and improves upon it by employing network coding to accurately
distinguish among all different 2-by-2 components. We evaluate our algorithms
through simulation of a number of realistic topologies and compare them to
active tomographic techniques without network coding. We also make connections
between our approach and alternatives, including passive inference, traceroute,
and packet marking
Multi-path Probabilistic Available Bandwidth Estimation through Bayesian Active Learning
Knowing the largest rate at which data can be sent on an end-to-end path such
that the egress rate is equal to the ingress rate with high probability can be
very practical when choosing transmission rates in video streaming or selecting
peers in peer-to-peer applications. We introduce probabilistic available
bandwidth, which is defined in terms of ingress rates and egress rates of
traffic on a path, rather than in terms of capacity and utilization of the
constituent links of the path like the standard available bandwidth metric. In
this paper, we describe a distributed algorithm, based on a probabilistic
graphical model and Bayesian active learning, for simultaneously estimating the
probabilistic available bandwidth of multiple paths through a network. Our
procedure exploits the fact that each packet train provides information not
only about the path it traverses, but also about any path that shares a link
with the monitored path. Simulations and PlanetLab experiments indicate that
this process can dramatically reduce the number of probes required to generate
accurate estimates
- …