Kernel-based Inference of Functions over Graphs
The study of networks has witnessed an explosive growth over the past decades
with several ground-breaking methods introduced. A particularly interesting --
and prevalent in several fields of study -- problem is that of inferring a
function defined over the nodes of a network. This work presents a versatile
kernel-based framework for tackling this inference problem that naturally
subsumes and generalizes the reconstruction approaches put forth recently by
the signal processing on graphs community. Both the static and the dynamic
settings are considered along with effective modeling approaches for addressing
real-world problems. The analytical discussion herein is complemented by a set
of numerical examples, which showcase the effectiveness of the presented
techniques, as well as their merits relative to state-of-the-art methods.
Comment: To be published as a chapter in `Adaptive Learning Methods for
Nonlinear System Modeling', Elsevier Publishing, Eds. D. Comminiello and J.C.
Principe (2018). This chapter surveys recent work on kernel-based inference
of functions over graphs including arXiv:1612.03615 and arXiv:1605.07174 and
arXiv:1711.0930
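The inference problem described in this abstract can be illustrated with a minimal sketch. It assumes a regularized Laplacian kernel K = (L + eps*I)^{-1} and a ridge-style estimator, neither of which is stated in the abstract. The names `solve` and `reconstruct`, the path graph and the parameter values are hypothetical choices for illustration only.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def reconstruct(edges, n_nodes, observed, mu=0.1, eps=1e-3):
    """Estimate a function on all nodes from samples on a few nodes.

    Minimizes sum_{i observed} (y_i - f_i)^2 + mu * f^T (L + eps*I) f,
    which is kernel ridge regression with the (assumed) kernel
    K = (L + eps*I)^{-1}, where L is the graph Laplacian.
    """
    # Regularizer mu * (L + eps*I).
    A = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        A[i][i] = mu * eps
    for u, v in edges:
        A[u][u] += mu
        A[v][v] += mu
        A[u][v] -= mu
        A[v][u] -= mu
    # Data-fit term: add 1 on the diagonal of each sampled node.
    b = [0.0] * n_nodes
    for node, y in observed.items():
        A[node][node] += 1.0
        b[node] += y
    return solve(A, b)


# Path graph 0 - 1 - 2 - 3 - 4 with samples on the two end nodes only.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
f_hat = reconstruct(edges, 5, {0: 0.0, 4: 4.0})
print([round(v, 3) for v in f_hat])
```

On this toy graph the estimate at the unsampled interior nodes is a smooth interpolation between the two sampled end nodes, which is the behaviour a smoothness-promoting graph kernel is meant to produce.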
Recent Trends in Modelling Spatio-Temporal Data
This paper reviews the most recent methodologies proposed for spatio-temporal models. In an attempt to offer a unified view of the methodologies covered, it first describes the various types of spatio-temporal data.
It then discusses models for spatially continuous processes. Spatio-temporal modelling has been widely used to address
problems in environmental science, geostatistics, hydrology and meteorology. This article provides an analysis of the methods currently applied in many of these areas.
Decentralized Data Fusion and Active Sensing with Mobile Sensors for Modeling and Predicting Spatiotemporal Traffic Phenomena
The problem of modeling and predicting spatiotemporal traffic phenomena over
an urban road network is important to many traffic applications such as
detecting and forecasting congestion hotspots. This paper presents a
decentralized data fusion and active sensing (D2FAS) algorithm for mobile
sensors to actively explore the road network to gather and assimilate the most
informative data for predicting the traffic phenomenon. We analyze the time and
communication complexity of D2FAS and demonstrate that it can scale well with a
large number of observations and sensors. We provide a theoretical guarantee on
its predictive performance to be equivalent to that of a sophisticated
centralized sparse approximation for the Gaussian process (GP) model: The
computation of such a sparse approximate GP model can thus be parallelized and
distributed among the mobile sensors (in a Google-like MapReduce paradigm),
thereby achieving efficient and scalable prediction. We also theoretically
guarantee its active sensing performance that improves under various practical
environmental conditions. Empirical evaluation on real-world urban road network
data shows that our D2FAS algorithm is significantly more time-efficient and
scalable than state-of-the-art centralized algorithms while achieving
comparable predictive performance.
Comment: 28th Conference on Uncertainty in Artificial Intelligence (UAI 2012),
Extended version with proofs, 13 pages
High-Dimensional Bayesian Geostatistics
With the growing capabilities of Geographic Information Systems (GIS) and
user-friendly software, statisticians today routinely encounter geographically
referenced data containing observations from a large number of spatial
locations and time points. Over the last decade, hierarchical spatiotemporal
process models have become widely deployed statistical tools for researchers to
better understand the complex nature of spatial and temporal variability.
However, fitting hierarchical spatiotemporal models often involves expensive
matrix computations with complexity increasing in cubic order for the number of
spatial locations and temporal points. This renders such models unfeasible for
large data sets. This article offers a focused review of two methods for
constructing well-defined highly scalable spatiotemporal stochastic processes.
Both these processes can be used as "priors" for spatiotemporal random fields.
The first approach constructs a low-rank process operating on a
lower-dimensional subspace. The second approach constructs a Nearest-Neighbor
Gaussian Process (NNGP) that ensures sparse precision matrices for its finite
realizations. Both processes can be exploited as a scalable prior embedded
within a rich hierarchical modeling framework to deliver full Bayesian
inference. These approaches can be described as model-based solutions for big
spatiotemporal datasets. The models ensure that the algorithmic complexity is
on the order of n floating point operations (flops) per iteration, where n is
the number of spatial locations. We compare these methods and provide some insight
into their methodological underpinnings
Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets
Spatial process models for analyzing geostatistical data entail computations
that become prohibitive as the number of spatial locations becomes large. This
manuscript develops a class of highly scalable Nearest Neighbor Gaussian
Process (NNGP) models to provide fully model-based inference for large
geostatistical datasets. We establish that the NNGP is a well-defined spatial
process providing legitimate finite-dimensional Gaussian densities with sparse
precision matrices. We embed the NNGP as a sparsity-inducing prior within a
rich hierarchical modeling framework and outline how computationally efficient
Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or
decomposing large matrices. The number of floating point operations (flops) per
iteration of this algorithm is linear in the number of spatial locations, thereby
rendering substantial scalability. We illustrate the computational and
inferential benefits of the NNGP over competing methods using simulation
studies and also analyze forest biomass from a massive United States Forest
Inventory dataset at a scale that precludes alternative dimension-reducing
methods
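The linear cost claimed in this abstract can be seen in a deliberately simplified sketch. It assumes one spatial dimension, sorted and distinct locations, an exponential covariance, and a single nearest neighbour (m = 1), none of which come from the abstract. The function names and parameter values are hypothetical.

```python
import math


def exp_cov(s, t, sigma2=1.0, phi=1.0):
    """Exponential covariance between locations s and t (an assumed choice)."""
    return sigma2 * math.exp(-phi * abs(s - t))


def nngp_logdensity(locs, y, cov=exp_cov):
    """Log-density of y under a zero-mean NNGP with m = 1 neighbour.

    locs must be sorted and distinct, so the nearest previously ordered
    neighbour of locs[i] is locs[i - 1]. Each term conditions on that
    single neighbour, so the cost is linear in len(locs) and no large
    matrix is stored or decomposed.
    """
    total = 0.0
    for i in range(len(locs)):
        var = cov(locs[i], locs[i])
        mean = 0.0
        if i > 0:
            c = cov(locs[i], locs[i - 1])
            b = c / cov(locs[i - 1], locs[i - 1])  # regression weight on neighbour
            mean = b * y[i - 1]
            var -= b * c                           # conditional variance
        total += -0.5 * (math.log(2.0 * math.pi * var) + (y[i] - mean) ** 2 / var)
    return total


locs = [0.0, 0.5, 1.2, 3.0]
y = [0.3, 0.1, -0.4, 0.8]
print(nngp_logdensity(locs, y))
```

With m neighbours instead of one, each term would need a small m-by-m solve, so the per-iteration cost would stay linear in the number of locations for fixed m.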
Sensor Data Fusion for Improving Traffic Mobility in Smart Cities
The ever-increasing urban population and vehicular traffic, without a corresponding expansion of infrastructure, have been a challenge to transportation facilities managers and commuters. While some parts of the transportation infrastructure have big data available, many other locations have only sparse data. This poses a challenge in traffic state estimation and prediction for efficient and effective infrastructure management and route guidance. This research focuses on traffic prediction problems and aims to develop novel, robust spatio-temporal algorithms that can provide high accuracy in the presence of both big data and sparse data in a large urban road network.
Intelligent transportation systems require knowledge of the current traffic state and a forecast for effective implementation. The actual traffic state has to be estimated because the existing sensors do not capture the needed state. Sensor measurements often contain missing or incomplete data as a result of communication issues, faulty sensors or cost, leading to incomplete monitoring of the entire road network. These missing data pose challenges to traffic estimation approaches. In this work, a robust spatio-temporal traffic imputation approach capable of withstanding a high missing data rate is presented. A particle-based approach with Kriging interpolation is proposed. The performance of the particle-based Kriging interpolation for different missing data ratios was investigated for a large road network.
A particle-based framework for dealing with missing data is also proposed. An expression of the likelihood function is derived for the case when the missing value is calculated based on Kriging interpolation. With the Kriging interpolation, the missing values of the measurements are predicted, which are subsequently used in the computation of likelihood terms in the particle filter algorithm.
In the commonly used Kriging approaches, the covariance function depends only on the separation distance irrespective of the traffic at the considered locations. A key limitation of such an approach is its inability to capture well the traffic dynamics and transitions between different states. This thesis proposes a Bayesian Kriging approach for the prediction of urban traffic. The approach can capture these dynamics and model changes via the covariance matrix. The main novelty consists in representing both stationary and non-stationary changes in traffic flows by a discriminative covariance function conditioned on the observation at each location. An advantage is that by considering the surrounding traffic information distinctively, the proposed method is very likely to represent congested regions and interactions in both upstream and downstream areas
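For context, the baseline that this abstract contrasts itself with, Kriging with a covariance that depends only on separation distance, can be sketched as follows. This is not the thesis's Bayesian Kriging method. It is simple Kriging from two sensors with an assumed exponential covariance, a known mean and no measurement noise. The sensor positions and flow values are made up.

```python
import math


def cov(d, sill=1.0, length=2.0):
    """Exponential covariance depending only on separation distance d."""
    return sill * math.exp(-d / length)


def krige(x_obs, y_obs, x_new, mean=0.0):
    """Simple Kriging of the value at x_new from two observed locations."""
    (x1, x2), (y1, y2) = x_obs, y_obs
    # Covariances among the observations and between observations and target.
    c11, c22, c12 = cov(0.0), cov(0.0), cov(abs(x1 - x2))
    k1, k2 = cov(abs(x_new - x1)), cov(abs(x_new - x2))
    det = c11 * c22 - c12 * c12
    # Kriging weights w = C^{-1} k, written out for the 2x2 case.
    w1 = (c22 * k1 - c12 * k2) / det
    w2 = (c11 * k2 - c12 * k1) / det
    pred = mean + w1 * (y1 - mean) + w2 * (y2 - mean)
    var = cov(0.0) - (w1 * k1 + w2 * k2)
    return pred, var


# Hypothetical flows (vehicles/min) at sensors located at 0 km and 4 km;
# impute the missing reading of a sensor at 1 km.
pred, var = krige((0.0, 4.0), (30.0, 50.0), 1.0, mean=40.0)
print(round(pred, 2), round(var, 3))
```

The weights depend only on the sensor geometry, not on the traffic values themselves, which is the limitation the abstract attributes to distance-only covariance functions.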
EFFICIENT PARAMETRIC AND NON-PARAMETRIC LOCALIZATION AND MAPPING IN ROBOTIC NETWORKS
Since the eighties, localization and mapping problems have attracted the efforts of robotics researchers. In the last decade, however, thanks to the increasing capabilities of new electronic devices, many new related challenges have been posed, such as swarm robotics, aerial vehicles, autonomous cars and robotic networks. Efficiency, robustness and scalability play a key role in these scenarios.
Efficiency is understood as the ability of an application to minimize resource usage, in particular CPU time and memory space. The aforementioned applications require an underlying communication network, so by robustness we mean asynchronous algorithms that are resilient to delays and packet losses. Finally, scalability is the ability of an application to continue functioning without any dramatic performance degradation even as the number of devices involved keeps increasing.
This thesis focuses on parametric and non-parametric estimation algorithms applied to localization and mapping in robotics. The main contributions can be summarized in the following four topics:
(i) Consensus-based localization: We address the problem of optimally estimating the position of each agent in a network from noisy relative vectorial distances to its neighbors by means of only local communication and bounded complexity, independent of network size and topology. In particular, we propose a consensus-based algorithm that uses local memory variables, which allows asynchronous implementation, has guaranteed exponential convergence to the optimal solution under simple deterministic and randomized communication protocols, and requires minimal packet transmission. In the randomized scenario, we then study the rate of convergence in expectation of the estimation error and we argue that it can be used to obtain upper and lower bounds for the rate of convergence in mean square. In particular, we show that for regular graphs, such as Cayley, Ramanujan, and complete graphs, the convergence rate in expectation has the same asymptotic degradation as that of memoryless asynchronous consensus algorithms in terms of network size. In addition, we show that the asynchronous implementation is also robust to delays and communication failures. We finally complement the analytical results with some numerical simulations, comparing the proposed strategy with other algorithms which have been recently proposed in the literature.
(ii) Aerial Vehicles distributed localization: We study the problem of distributed multi-agent localization in the presence of heterogeneous measurements and wireless communication. The proposed algorithm integrates low-precision global sensors, like GPS and compasses, with more precise relative position (i.e., range plus bearing) sensors. Global sensors are used to reconstruct the absolute position and orientation, while relative sensors are used to retrieve the shape of the formation. A fast distributed and asynchronous linear least-squares algorithm is proposed to solve an approximated version of the non-linear Maximum Likelihood problem. The algorithm is provably robust to communication losses and random delays. The use of ACK-less broadcast-based communication protocols ensures an efficient and easy implementation in real-world scenarios. If the relative measurement errors are sufficiently small, we show that the algorithm attains a solution which is very close to the maximum likelihood solution. The theoretical findings and the algorithm's performance are extensively tested by means of Monte-Carlo simulations.
(iii) Estimation and Coverage: We address the problem of optimal coverage of a region via multiple robots when the sensory field used to approximate the density of event appearance is not known in advance. We address this problem in the context of a client-server architecture in which the mobile robots can communicate with a base station via a possibly unreliable wireless network subject to packet losses. Based on Gaussian regression, which allows the true sensory field to be estimated with arbitrary accuracy, we propose a randomised strategy in which the robots and the base station simultaneously estimate the true sensory distribution by collecting measurements and compute the corresponding optimal Voronoi partitions. This strategy is designed to promote exploration at the beginning and then smoothly transition to stationing the robots at the centroids of the estimated optimal Voronoi partitions. Under mild assumptions on the transmission failure probability, we prove that the proposed strategy guarantees the convergence of the estimated sensory field to the true field and that the corresponding Voronoi partitions asymptotically become arbitrarily close to an optimal Voronoi partition. Additionally, we provide numerically efficient approximations that trade off accuracy of the estimated map for reduced memory and CPU complexity. Finally, we provide a set of extensive simulations which confirm the effectiveness of the proposed approach.
(iv) Non-parametric estimation of spatio-temporal fields: We address the problem of efficiently and optimally estimating an unknown time-varying function through the collection of noisy measurements. We cast our problem in the framework of non-parametric estimation and we assume that the unknown function is generated by a Gaussian process with a known covariance. Under mild assumptions on the kernel function, we propose a solution which links standard Gaussian regression to Kalman filtering by exploiting a grid on which measurement collection and estimation take place. This work presents a method, efficient in both time and space, for estimating time-varying functions, which combines the advantages of Gaussian regression, e.g. being model-less, and of the Kalman filter, e.g. efficiency
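The link between Gaussian regression and Kalman filtering mentioned in item (iv) can be illustrated in the simplest setting. The sketch assumes a scalar function of time only, a regular time grid, and the exponential kernel sigma2 * exp(-|t - t'| / ell), for which the Gaussian process is exactly an AR(1) state-space model. The thesis's actual kernel, grid and spatial component are not given in the abstract, so the function name and all parameter values are hypothetical.

```python
import math


def kalman_ou(ys, dt=1.0, sigma2=1.0, ell=2.0, r=0.1):
    """Filter noisy samples of a GP with kernel sigma2 * exp(-|t - t'| / ell).

    On a regular time grid this GP is the AR(1) state-space model
        x[k+1] = a * x[k] + w,  w ~ N(0, sigma2 * (1 - a^2)),  a = exp(-dt / ell)
        y[k]   = x[k] + v,      v ~ N(0, r)
    so a scalar Kalman filter gives the GP posterior of the current value
    given all samples so far, at constant cost per sample.
    """
    a = math.exp(-dt / ell)
    q = sigma2 * (1.0 - a * a)
    mean, var = 0.0, sigma2            # stationary prior at the first step
    out = []
    for k, y in enumerate(ys):
        if k > 0:                      # predict one grid step ahead
            mean, var = a * mean, a * a * var + q
        gain = var / (var + r)         # update with the new measurement
        mean, var = mean + gain * (y - mean), (1.0 - gain) * var
        out.append((mean, var))
    return out


for m, v in kalman_ou([1.0, 2.0, 1.5, 0.5]):
    print(round(m, 4), round(v, 4))
```

Batch Gaussian regression on the same samples would require solving a linear system that grows with the number of measurements, whereas the filter keeps only the current mean and variance.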
Bayesian Inference for Dynamic Spatio-temporal Models
Spatio-temporal processes are phenomena that evolve in space, as a point, a field or a map, and also vary in time. A stochastic process may be proposed as a vehicle for inference and hence for predicting the future. High-dimensional datasets are now available in which measurements are observed daily or even hourly at more than one location, along with many predictors. Therefore, what we would like to infer is high-dimensional, and the analysis is difficult to carry out because of the complexity and computational cost of the calculations.
The first Reduced-dimension Dynamic Spatio-Temporal Models (DSTMs) were developed to jointly describe the spatial and temporal evolution of a function observed subject to noise. A basic state-space model is adopted for the discrete temporal variation, while a continuous autoregressive structure describes the continuous spatial evolution. Application of DSTMs relies upon the pre-selection of a suitable reduced set of basis functions, and this can present a challenge in practice.
In this thesis we propose a Hierarchical Bayesian framework for high-dimensional spatio-temporal data based upon DSTMs, which attempts to resolve this issue by allowing the basis to adapt to the observed data. Specifically, we present a wavelet decomposition for the spatial evolution, a setting in which one would typically expect parsimony. This expected parsimony can be achieved by placing a Spike and Slab prior distribution on the wavelet coefficients. The aim of using the Spike and Slab prior is to filter out wavelet coefficients with low contribution, and thus achieve dimension reduction with significant computational savings.
We then propose a Hierarchical Bayesian state-space model, for the estimation of which we offer an appropriate Forward Filtering Backward Sampling algorithm within an MCMC procedure. We then extend this model to estimate Poisson counts and Multinomial cell probabilities by proposing a Conditional Particle Filtering framework
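The filtering effect of a Spike and Slab prior on wavelet coefficients can be sketched for a single coefficient. The sketch assumes a point-mass spike at zero, a zero-mean Gaussian slab, Gaussian noise with known variance, and a hard keep-or-zero rule. The thesis's actual prior specification and sampler are not given in the abstract, so every name and value below is hypothetical.

```python
import math


def normal_pdf(x, var):
    """Density of a zero-mean Gaussian with variance var at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def inclusion_prob(d, noise_var=1.0, slab_var=25.0, prior=0.5):
    """Posterior probability that an observed coefficient d comes from the slab.

    Assumed model: d = theta + noise, noise ~ N(0, noise_var), and
    theta = 0 with probability 1 - prior (spike), else theta ~ N(0, slab_var).
    """
    slab = prior * normal_pdf(d, noise_var + slab_var)
    spike = (1.0 - prior) * normal_pdf(d, noise_var)
    return slab / (slab + spike)


def select(coeffs, threshold=0.5, **kw):
    """Zero out coefficients whose inclusion probability is below threshold."""
    return [d if inclusion_prob(d, **kw) > threshold else 0.0 for d in coeffs]


coeffs = [0.1, 8.0, -0.3, -6.5, 0.4]
print([round(inclusion_prob(d), 3) for d in coeffs])
print(select(coeffs))
```

Small coefficients are explained well by noise alone and are dropped, while large ones are kept, which is how the prior yields a reduced set of active basis functions.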