Learning Large-Scale Bayesian Networks with the sparsebn Package
Learning graphical models from data is an important problem with wide
applications, ranging from genomics to the social sciences. Nowadays datasets
often have upwards of thousands (sometimes tens or hundreds of thousands) of
variables and far fewer samples. To meet this challenge, we have developed a
new R package called sparsebn for learning the structure of large, sparse
graphical models with a focus on Bayesian networks. While there are many
existing software packages for this task, this package focuses on the unique
setting of learning large networks from high-dimensional data, possibly with
interventions. As such, the methods provided place a premium on scalability and
consistency in a high-dimensional setting. Furthermore, in the presence of
interventions, the methods implemented here achieve the goal of learning a
causal network from data. Additionally, the sparsebn package is fully
compatible with existing software packages for network analysis.
Comment: To appear in the Journal of Statistical Software, 39 pages, 7 figures
Telling Cause from Effect using MDL-based Local and Global Regression
We consider the fundamental problem of inferring the causal direction between
two univariate numeric random variables X and Y from observational data.
The two-variable case is especially difficult to solve since it is not possible
to use standard conditional independence tests between the variables.
To tackle this problem, we follow an information theoretic approach based on
Kolmogorov complexity and use the Minimum Description Length (MDL) principle to
provide a practical solution. In particular, we propose a compression scheme to
encode local and global functional relations using MDL-based regression. We
infer that X causes Y if it is shorter to describe Y as a function of X
than the inverse direction. In addition, we introduce Slope, an efficient
linear-time algorithm. A thorough empirical evaluation on both synthetic and
real-world data shows that it outperforms the state of the art by a wide
margin.
Comment: 10 pages, To appear in ICDM1
Faster Rates for Policy Learning
This article improves the existing proven rates of regret decay in optimal
policy estimation. We give a margin-free result showing that the regret decay
for estimating a within-class optimal policy is second-order for empirical risk
minimizers over Donsker classes, with regret decaying at a faster rate than the
standard error of an efficient estimator of the value of an optimal policy. We
also give a result from the classification literature that shows that faster
regret decay is possible via plug-in estimation provided a margin condition
holds. Four examples are considered. In these examples, the regret is expressed
in terms of either the mean value or the median value; the number of possible
actions is either two or finitely many; and the sampling scheme is either
independent and identically distributed or sequential, where the latter
represents a contextual bandit sampling scheme.
Online Causal Structure Learning in the Presence of Latent Variables
We present two online causal structure learning algorithms which can track
changes in a causal structure and process data in a dynamic real-time manner.
Standard causal structure learning algorithms assume that causal structure does
not change during the data collection process, but in real-world scenarios, it
does often change. Therefore, it is inappropriate to handle such changes with
existing batch-learning approaches, and instead, a structure should be learned
in an online manner. The online causal structure learning algorithms we present
here can revise correlation values without reprocessing the entire dataset and
use an existing model to avoid relearning the causal links in the prior model
that still fit the data. The proposed algorithms are tested on synthetic and
real-world datasets, the latter being a seasonally adjusted commodity price
index dataset for the U.S. The online causal structure learning algorithms
outperformed standard FCI by a large margin in learning the changed causal
structure correctly and efficiently when latent variables were present.
Comment: 16 pages, 9 figures, 2 tables
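The abstract above mentions revising correlation values without reprocessing the entire dataset. As a rough illustration of that general idea only, the sketch below keeps running means and co-moments (a Welford-style update) so that a Pearson correlation can be refreshed in constant time per new sample. The class name `RunningCorrelation` and its methods are hypothetical, and this is a generic textbook update, not necessarily the scheme the authors use.

```python
import math


class RunningCorrelation:
    """Tracks the Pearson correlation of a stream of (x, y) pairs.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.m2_x = 0.0  # running sum of squared deviations of x
        self.m2_y = 0.0  # running sum of squared deviations of y
        self.c_xy = 0.0  # running sum of co-deviations of x and y

    def update(self, x, y):
        """Fold one new observation into the running statistics."""
        self.n += 1
        dx = x - self.mean_x  # deviation from the old mean of x
        dy = y - self.mean_y  # deviation from the old mean of y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        # Each update multiplies an old-mean deviation by a new-mean deviation.
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def correlation(self):
        """Return the current correlation, or NaN if it is undefined."""
        if self.n < 2 or self.m2_x == 0.0 or self.m2_y == 0.0:
            return float("nan")
        return self.c_xy / math.sqrt(self.m2_x * self.m2_y)
```

Calling `update(x, y)` for each arriving sample and reading `correlation()` afterwards gives the same value as a batch computation over all samples seen so far, without storing them.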
Distributed stochastic optimization via matrix exponential learning
In this paper, we investigate a distributed learning scheme for a broad class
of stochastic optimization problems and games that arise in signal processing
and wireless communications. The proposed algorithm relies on the method of
matrix exponential learning (MXL) and only requires locally computable gradient
observations that are possibly imperfect and/or obsolete. To analyze it, we
introduce the notion of a stable Nash equilibrium and we show that the
algorithm is globally convergent to such equilibria, or locally convergent
when an equilibrium is only locally stable. We also derive an explicit linear
bound for the algorithm's convergence speed, which remains valid under
measurement errors and uncertainty of arbitrarily high variance. To validate
our theoretical analysis, we test the algorithm in realistic
multi-carrier/multiple-antenna wireless scenarios where several users seek to
maximize their energy efficiency. Our results show that learning allows users
to attain a net increase between 100% and 500% in energy efficiency, even under
very high uncertainty.
Comment: 31 pages, 3 figures
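The abstract above does not spell out the update rule. As commonly described, matrix exponential learning accumulates gradient observations in an auxiliary matrix Y and maps it back to a feasible point through X = exp(Y) / tr(exp(Y)). The sketch below applies that recursion to a toy problem: maximizing tr(AX) over positive semidefinite X with unit trace. Everything here is an assumption made for illustration: the function names, the toy objective, the constant step size, and the exact (noise-free) gradient. The paper's setting is stochastic and game-theoretic and is not reproduced.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=20):
    """Matrix exponential via a scaled Taylor series and repeated squaring."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    squarings = 0
    while norm > 0.5:  # shrink the matrix so the Taylor series converges fast
        norm /= 2.0
        squarings += 1
    scale = 2.0 ** squarings
    scaled = [[v / scale for v in row] for row in a]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):  # undo the scaling: exp(A) = exp(A / 2^s)^(2^s)
        result = mat_mul(result, result)
    return result


def mxl_step(y, grad, step):
    """One update: accumulate the gradient, then normalize exp(Y) by its trace."""
    n = len(y)
    y_new = [[y[i][j] + step * grad[i][j] for j in range(n)] for i in range(n)]
    e = mat_exp(y_new)
    trace = sum(e[i][i] for i in range(n))
    return y_new, [[v / trace for v in row] for row in e]


def run_mxl(a, steps=20, step=0.5):
    """Toy problem: maximize tr(A X) for symmetric A; the gradient in X is A."""
    n = len(a)
    y = [[0.0] * n for _ in range(n)]
    x = [[1.0 / n if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        y, x = mxl_step(y, a, step)
    return x
```

For a symmetric A, the iterates of this toy run concentrate on the projector onto the top eigenvector of A while keeping unit trace at every step, which is why the exponential map is convenient: feasibility is maintained without an explicit projection.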
Probabilistic Hybrid Action Models for Predicting Concurrent Percept-driven Robot Behavior
This article develops Probabilistic Hybrid Action Models (PHAMs), a realistic
causal model for predicting the behavior generated by modern percept-driven
robot plans. PHAMs represent aspects of robot behavior that cannot be
represented by most action models used in AI planning: the temporal structure
of continuous control processes, their non-deterministic effects, several modes
of their interferences, and the achievement of triggering conditions in
closed-loop robot plans.
The main contributions of this article are: (1) PHAMs, a model of concurrent
percept-driven behavior, its formalization, and proofs that the model generates
probably, qualitatively accurate predictions; and (2) a resource-efficient
inference method for PHAMs based on sampling projections from probabilistic
action models and state descriptions. We show how PHAMs can be applied to
planning the course of action of an autonomous robot office courier based on
analytical and experimental results.