5,839 research outputs found
Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles
We examine a network of learners which address the same classification task
but must learn from different data sets. The learners cannot share data but
instead share their models. Models are shared only once so as to limit
the network load. We introduce DELCO (standing for Decentralized Ensemble
Learning with COpulas), a new approach that aggregates the predictions of
the classifiers trained by each learner. The proposed method aggregates the
base classifiers using a probabilistic model relying on Gaussian copulas.
Experiments on logistic regressor ensembles demonstrate competitive accuracy and
increased robustness when the classifiers are dependent. A companion Python
implementation can be downloaded at https://github.com/john-klein/DELC
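As a rough illustration of the building block named in this abstract, the sketch below evaluates the density of a bivariate Gaussian copula, which measures how strongly two uniform-scaled scores (for example, transformed outputs of two classifiers) depart from independence. It is not the DELCO implementation: the function name `gaussian_copula_density` and the correlation parameter `rho` are assumptions chosen for the example, and the abstract does not specify how the copula is combined with the classifier outputs.

```python
from math import exp, sqrt
from statistics import NormalDist

# Standard normal distribution, used for the inverse CDF (quantile) transform.
_STD_NORMAL = NormalDist()


def gaussian_copula_density(u: float, v: float, rho: float) -> float:
    """Density of the bivariate Gaussian copula at (u, v) in (0, 1)^2.

    `rho` is the correlation of the underlying bivariate normal, with
    -1 < rho < 1. This is a hypothetical helper for illustration only.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Map the uniform margins to standard normal scores.
    x = _STD_NORMAL.inv_cdf(u)
    y = _STD_NORMAL.inv_cdf(v)
    one_minus_r2 = 1.0 - rho * rho
    # Ratio of the joint normal density to the product of its margins.
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus_r2)
    return exp(exponent) / sqrt(one_minus_r2)


if __name__ == "__main__":
    # With rho = 0 the copula reduces to independence (density 1 everywhere).
    print(gaussian_copula_density(0.3, 0.7, 0.0))
    # With positive rho, agreeing scores are up-weighted and disagreeing ones down-weighted.
    print(gaussian_copula_density(0.9, 0.9, 0.6))
    print(gaussian_copula_density(0.9, 0.1, 0.6))
```

With a positive `rho`, pairs of scores that agree receive a density above 1 and pairs that disagree receive a density below 1, which is the kind of dependence information an independence-based aggregation would ignore.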
Objective Bayes Factors for Gaussian Directed Acyclic Graphical Models
We propose an objective Bayesian method for the comparison of all Gaussian directed acyclic graphical models defined on a given set of variables. The method, which is based on the notion of fractional Bayes factor, requires a single default (typically improper) prior on the space of unconstrained covariance matrices, together with a prior sample size hyper-parameter, which can be set to its minimal value. We show that our approach produces genuine Bayes factors. The implied prior on the concentration matrix of any complete graph is a data-dependent Wishart distribution, and this in turn guarantees that Markov equivalent graphs are scored with the same marginal likelihood. We specialize our results to the smaller class of Gaussian decomposable undirected graphical models, and show that in this case they coincide with those recently obtained using limiting versions of hyper-inverse Wishart distributions as priors on the graph-constrained covariance matrices.
Keywords: Bayes factor; Bayesian model selection; Directed acyclic graph; Exponential family; Fractional Bayes factor; Gaussian graphical model; Objective Bayes; Standard conjugate prior; Structural learning.
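For readers unfamiliar with the term, the fractional Bayes factor mentioned in this abstract can be written in its general form as below. The notation is illustrative and not taken from the paper: $f_k(y \mid \theta_k)$ is the likelihood of model $M_k$, $p_k(\theta_k)$ its (possibly improper) default prior, and $b \in (0, 1)$ the fraction of the likelihood used to train the prior. A common choice is $b = n_0 / n$, with $n_0$ a prior sample size and $n$ the number of observations; whether the paper uses exactly this parameterization is an assumption here.

```latex
% Fractional marginal likelihood of model M_k
m_k(y; b) = \frac{\int f_k(y \mid \theta_k)\, p_k(\theta_k)\, d\theta_k}
                 {\int f_k(y \mid \theta_k)^{b}\, p_k(\theta_k)\, d\theta_k},
\qquad 0 < b < 1 .

% Fractional Bayes factor of M_1 against M_2
\mathrm{FBF}_{12}(y; b) = \frac{m_1(y; b)}{m_2(y; b)} .
```

Because the same prior appears in the numerator and the denominator of each $m_k$, any arbitrary normalizing constant of an improper prior cancels, which is what makes the comparison well defined under default priors.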
Automatic Bayesian Density Analysis
Making sense of a dataset in an automatic and unsupervised fashion is a
challenging problem in statistics and AI. Classical approaches for exploratory
data analysis are usually not flexible enough to deal with the uncertainty
inherent to real-world data: they are often restricted to fixed latent
interaction models and homogeneous likelihoods; they are sensitive to missing,
corrupt and anomalous data; moreover, their expressiveness generally comes at
the price of intractable inference. As a result, supervision from statisticians
is usually needed to find the right model for the data. However, since domain
experts are not necessarily also experts in statistics, we propose Automatic
Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible
at large. Specifically, ABDA allows for automatic and efficient missing value
estimation, statistical data type and likelihood discovery, anomaly detection
and dependency structure mining, on top of providing accurate density
estimation. Extensive empirical evidence shows that ABDA is a suitable tool for
automatic exploratory analysis of mixed continuous and discrete tabular data.
Comment: In proceedings of the Thirty-Third AAAI Conference on Artificial
Intelligence (AAAI-19)
Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning
We propose a new sampling method, the thermostat-assisted
continuously-tempered Hamiltonian Monte Carlo, for Bayesian learning on large
datasets and multimodal distributions. It simulates the Nosé-Hoover dynamics
of a continuously-tempered Hamiltonian system built on the distribution of
interest. A significant advantage of this method is that it is not only able to
efficiently draw representative i.i.d. samples when the distribution contains
multiple isolated modes, but also capable of adaptively neutralising the noise
arising from mini-batches and maintaining accurate sampling. While the
properties of this method have been studied using synthetic distributions,
experiments on three real datasets also demonstrated a gain in performance
over several strong baselines with various types of neural networks plunged in
- …