Problems in Signal Processing and Inference on Graphs

Abstract

Modern datasets are often massive due to the sharp decrease in the cost of collecting and storing data. Many are endowed with relational structure modeled by a graph, an object comprising a set of points and a set of pairwise connections between them. A "signal on a graph" has elements related to each other through a graph; it could model, for example, measurements from a sensor network. In this dissertation we study several problems in signal processing and inference on graphs. We begin by introducing an analogue to Heisenberg's time-frequency uncertainty principle for signals on graphs. We use spectral graph theory and the standard extension of Fourier analysis to graphs. Our spectral graph uncertainty principle makes precise the notion that a highly localized signal on a graph must have a broad spectrum, and vice versa. Next, we consider the problem of detecting a random walk on a graph from noisy observations. We characterize the performance of the optimal detector through the (type-II) error exponent, borrowing techniques from statistical physics to develop a lower bound exhibiting a phase transition. Strong performance is only guaranteed when the signal-to-noise ratio exceeds twice the random walk's entropy rate. Monte Carlo simulations show that the lower bound is quite close to the true exponent. Next, we introduce a technique for inferring the source of an epidemic from observations at a few nodes. We develop a Monte Carlo technique to simulate the infection process, and use statistics computed from these simulations to approximate the likelihood, which we then maximize to locate the source. We further introduce a logistic autoregressive model (ALARM), a simple model for binary processes on graphs that can still capture a variety of behaviors. We demonstrate its simplicity by showing how to easily infer the underlying graph structure from measurements, a technique versatile enough to work under model mismatch.
Finally, we introduce the exact formula for the error of the randomized Kaczmarz algorithm, a linear system solver for sparse systems, which often arise in graph theory. This is important because, as we show, existing performance bounds are quite loose.

Engineering and Applied Sciences - Engineering Science
