Bayesian Semi-supervised Learning with Graph Gaussian Processes

Author: Colombo Nicolo
Ng Yin Cheng
Silva Ricardo
Publication date: 12/10/2018

We propose a data-efficient Gaussian process-based Bayesian approach to the semi-supervised learning problem on graphs. The proposed model shows extremely competitive performance when compared to the state-of-the-art graph neural networks on semi-supervised learning benchmark experiments, and outperforms the neural networks in active learning experiments where labels are scarce. Furthermore, the model does not require a validation data set for early stopping to control over-fitting. Our model can be viewed as an instance of empirical distribution regression weighted locally by network connectivity. We further motivate the intuitive construction of the model with a Bayesian linear model interpretation where the node features are filtered by an operator related to the graph Laplacian. The method can be easily implemented by adapting off-the-shelf scalable variational inference algorithms for Gaussian processes.

Comment: To appear in NIPS 2018. Fixed an error in Figure 2. The previous arXiv version contains two identical sub-figures.

arXiv.org e-Print Archive

UCL Discovery
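
The abstract above describes node features being filtered by an operator related to the graph Laplacian before a Bayesian linear model is applied. A minimal sketch of that idea follows. It assumes the operator is the neighbourhood average P = (I + D)^-1 (I + A), which equals I - (I + D)^-1 L for the Laplacian L = D - A; the abstract does not spell the operator out, so this choice is an assumption. The function names `neighbourhood_average` and `linear_kernel` are hypothetical, and the variational inference step is omitted.

```python
def neighbourhood_average(adjacency, features):
    """Average each node's feature vector with those of its neighbours.

    Assumed operator (not stated in the abstract): P = (I + D)^-1 (I + A),
    where A is the adjacency matrix and D the diagonal degree matrix.
    """
    n = len(adjacency)
    filtered = []
    for i in range(n):
        # The node itself plus every node it shares an edge with.
        members = [i] + [j for j in range(n) if j != i and adjacency[i][j]]
        dim = len(features[i])
        filtered.append(
            [sum(features[j][d] for j in members) / len(members) for d in range(dim)]
        )
    return filtered


def linear_kernel(rows):
    """Gram matrix of a linear kernel, i.e. inner products of the given rows."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]


# Path graph 0 - 1 - 2 with one scalar feature per node.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[0.0], [3.0], [6.0]]

smoothed = neighbourhood_average(adjacency, features)
print(smoothed)  # [[1.5], [3.0], [4.5]]

# Covariance of a Bayesian linear model on the filtered features: P X X^T P^T.
print(linear_kernel(smoothed))
```

Under this assumption, each node's representation is pulled toward its neighbours', so labelled nodes share information with nearby unlabelled ones through the kernel.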

In-Network Outlier Detection in Wireless Sensor Networks

Author: A Beck
A Cerpa
Boleslaw Szymanski
Chris Giannella
D Apiletti
D Krivitski
G Tietjen
H Fan
Hillol Kargupta
IF Akyildiz
Joel W. Branch
K Bhaduri
K Das
K Holger
L Chen
M Bawa
M Mehyar
M Otey
P Gupta
R Wolff
Ran Wolff
S Basu
S Chong
S Mukherjee
V Barnett
V Hodge
W Mebane
X Sheng
Publication venue: Springer Science and Business Media LLC
Publication date: 03/09/2009

To address the problem of unsupervised outlier detection in wireless sensor networks, we develop an approach that (1) is flexible with respect to the outlier definition, (2) computes the result in-network to reduce both bandwidth and energy usage, (3) only uses single-hop communication, thus permitting very simple node failure detection and message reliability assurance mechanisms (e.g., carrier-sense), and (4) seamlessly accommodates dynamic updates to data. We examine performance using simulation with real sensor data streams. Our results demonstrate that our approach is accurate and imposes a reasonable communication load and level of power consumption.

Comment: Extended version of a paper appearing in the Int'l Conference on Distributed Computing Systems 200

arXiv.org e-Print Archive

Crossref

Early Accurate Results for Advanced Analytics on MapReduce

Author: Laptev Nikolay
Zaniolo Carlo
Zeng Kai
Publication date: 01/01/2012

Approximate results based on samples often provide the only way in which advanced analytical applications on very massive data sets can satisfy their time and resource constraints. Unfortunately, methods and tools for the computation of accurate early results are currently not supported in MapReduce-oriented systems, although these are intended for 'big data'. Therefore, we proposed and implemented a non-parametric extension of Hadoop which allows the incremental computation of early results for arbitrary work-flows, along with reliable on-line estimates of the degree of accuracy achieved so far in the computation. These estimates are based on a technique called bootstrapping that has been widely employed in statistics and can be applied to arbitrary functions and data distributions. In this paper, we describe our Early Accurate Result Library (EARL) for Hadoop that was designed to minimize the changes required to the MapReduce framework. Various tests of EARL on Hadoop are presented to characterize the frequent situations where EARL can provide major speed-ups over the current version of Hadoop.

Comment: VLDB201

arXiv.org e-Print Archive

CiteSeerX
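
The abstract above relies on bootstrapping to estimate, on-line, how accurate a sample-based result is. A minimal single-machine sketch of that idea follows. It is not EARL's implementation and uses no Hadoop API. The function names, the doubling schedule for the sample size, and the relative-standard-error stopping rule are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_error(sample, estimator, n_resamples=200, seed=0):
    """Return (estimate, relative standard error) of `estimator` on `sample`.

    The standard error is the standard deviation of the estimator over
    resamples drawn with replacement from the sample (the bootstrap).
    """
    rng = random.Random(seed)
    estimate = estimator(sample)
    replicates = [
        estimator(rng.choices(sample, k=len(sample))) for _ in range(n_resamples)
    ]
    std_err = statistics.stdev(replicates)
    rel_err = std_err / abs(estimate) if estimate != 0 else float("inf")
    return estimate, rel_err


def early_result(data, estimator, target_rel_err=0.05, start=100, seed=0):
    """Grow a uniform random sample until the bootstrap error meets the target.

    Hypothetical stopping rule: double the sample size each round and stop
    once the relative standard error is at most `target_rel_err`, or once
    the whole data set has been used.
    """
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)  # a prefix of a shuffled list is a uniform sample
    size = min(start, len(shuffled))
    while True:
        estimate, rel_err = bootstrap_error(shuffled[:size], estimator, seed=seed)
        if rel_err <= target_rel_err or size == len(shuffled):
            return estimate, rel_err, size
        size = min(2 * size, len(shuffled))


if __name__ == "__main__":
    gen = random.Random(1)
    data = [gen.gauss(50.0, 10.0) for _ in range(20000)]
    estimate, rel_err, used = early_result(data, statistics.mean, target_rel_err=0.01)
    print(f"estimate={estimate:.2f} rel_err={rel_err:.4f} rows_used={used}")
```

Because the bootstrap only needs to re-evaluate the estimator on resamples, the same loop works for any function of the data, such as `statistics.median`, which is what makes the approach non-parametric.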