Data Monitoring and Analysis in Wireless Networks
Various wireless network technologies have been created to meet the ever-increasing demand for wireless access to the Internet, such as wireless local area networks, cellular networks, sensor networks, and many more. Communication devices have evolved from large computational servers to small wireless hand-held devices, ranging from laptops, tablets, and smartphones to small sensors. Advances in these wireless networks (e.g., faster network speeds) and their intensive use have resulted in enormous growth of network data in terms of volume, diversity, and complexity. All of these changes have raised complicated issues for network measurement and management.
In the first part of this thesis, I study how WiFi network characteristics impact network forensics investigation and home security monitoring. I first focus on network forensics investigation and propose a wireless forensic monitoring system that collects trace digests of WiFi activities to facilitate cybercrime investigation. Then, I design and develop a low-cost home security system based on WiFi networks for physical intruder detection. Two methods, MAC-based detection and RSSI-variance-based detection, are proposed based on the characteristics of WiFi networks. In the second part, I study how to effectively and efficiently model multiple coevolving time series, which are ubiquitous in network measurement, especially in wireless sensor networks. Two comprehensive algorithms are proposed to address three prominent challenges of mining coevolving sensor-measured traces: (a) high order; (b) contextual constraints; and (c) temporal smoothness.
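The abstract does not describe how RSSI-variance-based detection is computed, so the following is only a minimal sketch of the general idea: a person moving through a room tends to disturb received signal strength, so a sliding-window variance that exceeds a threshold can be treated as a possible intrusion. The function name `detect_motion`, the window length, and the threshold are all hypothetical choices for illustration, not values from the thesis.

```python
from collections import deque
from statistics import pvariance

def detect_motion(rssi_dbm, window=5, threshold=4.0):
    """Return the sample indices at which the variance of the most recent
    `window` RSSI readings (in dBm) exceeds `threshold` (in dB^2).

    Both `window` and `threshold` are illustrative assumptions.
    """
    recent = deque(maxlen=window)   # keeps only the latest `window` readings
    alerts = []
    for i, value in enumerate(rssi_dbm):
        recent.append(value)
        # Only score full windows so early samples do not trigger alerts.
        if len(recent) == window and pvariance(recent) > threshold:
            alerts.append(i)
    return alerts

# Hypothetical readings: a stable link, then fluctuations as someone walks by.
quiet = [-50, -51, -50, -49, -50, -51, -50]
disturbed = quiet + [-58, -44, -60, -47]

print(detect_motion(quiet))       # []
print(detect_motion(disturbed))   # [7, 8, 9, 10]
```

In practice the threshold would have to be calibrated per environment, since the baseline RSSI variance of an empty room depends on the hardware and on multipath conditions.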
Simultaneous Measurement Imputation and Outcome Prediction for Achilles Tendon Rupture Rehabilitation
Achilles Tendon Rupture (ATR) is one of the typical soft tissue injuries.
Rehabilitation after such a musculoskeletal injury remains a prolonged process
with a very variable outcome. Accurately predicting rehabilitation outcome is
crucial for treatment decision support. However, it is challenging to train an
automatic method for predicting the ATR rehabilitation outcome from treatment
data, due to a massive amount of missing entries in the data recorded from ATR
patients, as well as complex nonlinear relations between measurements and
outcomes. In this work, we design an end-to-end probabilistic framework to
impute missing data entries and predict rehabilitation outcomes simultaneously.
We evaluate our model on a real-life ATR clinical cohort, comparing it with
various baselines. The proposed method demonstrates clear superiority over
traditional methods, which typically perform imputation and prediction in two
separate stages.
Probabilistic sequential matrix factorization
We introduce the probabilistic sequential matrix factorization (PSMF) method
for factorizing time-varying and non-stationary datasets consisting of
high-dimensional time-series. In particular, we consider nonlinear Gaussian
state-space models where sequential approximate inference results in the
factorization of a data matrix into a dictionary and time-varying coefficients
with potentially nonlinear Markovian dependencies. The assumed Markovian
structure on the coefficients enables us to encode temporal dependencies into a
low-dimensional feature space. The proposed inference method is solely based on
an approximate extended Kalman filtering scheme, which makes the resulting
method particularly efficient. PSMF can account for temporal nonlinearities
and, more importantly, can be used to calibrate and estimate generic
differentiable nonlinear subspace models. We also introduce a robust version of
PSMF, called rPSMF, which uses Student-t filters to handle model
misspecification. We show that PSMF can be used in multiple contexts: modeling
time series with a periodic subspace, robustifying changepoint detection
methods, and imputing missing data in several high-dimensional time-series,
such as measurements of pollutants across London.
Comment: Accepted for publication at AISTATS 202
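To illustrate how a filtering scheme can impute missing entries, here is a minimal sketch of a plain scalar Kalman filter with a random-walk state. It is a deliberately simpler stand-in, not the approximate extended Kalman filter or the matrix factorization described in the abstract. The function name `kalman_impute` and the noise parameters `q` and `r` are assumptions made for this example.

```python
import math

def kalman_impute(ys, q=0.01, r=0.25, x0=0.0, p0=1.0):
    """Filter a 1-D series with a random-walk Kalman filter.

    NaN entries are treated as missing: the update step is skipped and the
    predicted state is reported instead. `q` (process variance) and `r`
    (observation variance) are illustrative assumptions.
    """
    x, p = x0, p0
    filtered = []
    for y in ys:
        p = p + q                    # predict: mean is unchanged, variance grows
        if not math.isnan(y):        # update only when an observation exists
            k = p / (p + r)          # Kalman gain
            x = x + k * (y - x)
            p = (1.0 - k) * p
        filtered.append(x)
    return filtered

nan = float("nan")
series = [1.0, 1.1, nan, nan, 1.3]   # hypothetical measurements with a gap
estimates = kalman_impute(series)
print(estimates)
```

During the gap the estimate simply carries the last filtered value forward while its variance grows, which is the behaviour a random-walk model implies.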
Networked Time Series Prediction with Incomplete Data
A networked time series (NETS) is a family of time series on a given graph,
one for each node. It has a wide range of applications, from intelligent
transportation and environmental monitoring to smart grid management. An important
task in such applications is to predict the future values of a NETS based on
its historical values and the underlying graph. Most existing methods require
complete data for training. However, in real-world scenarios, it is not
uncommon to have missing data due to sensor malfunction, incomplete sensing
coverage, etc. In this paper, we study the problem of NETS prediction with
incomplete data. We propose NETS-ImpGAN, a novel deep learning framework that
can be trained on incomplete data with missing values in both history and
future. Furthermore, we propose Graph Temporal Attention Networks, which
incorporate the attention mechanism to capture both inter-time series and
temporal correlations. We conduct extensive experiments on four real-world
datasets under different missing patterns and missing rates. The experimental
results show that NETS-ImpGAN outperforms existing methods, reducing the MAE by
up to 25%.
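When the ground truth itself contains missing values, an error metric such as MAE can only be computed over the entries that were actually observed. The abstract does not state the exact evaluation protocol, so the sketch below is one common convention, with the helper name `masked_mae` and the use of NaN as the missing-value marker both being assumptions.

```python
import math

def masked_mae(predicted, observed):
    """Mean absolute error over positions where `observed` is not NaN."""
    errors = [
        abs(p - o)
        for p, o in zip(predicted, observed)
        if not math.isnan(o)         # skip targets that were never measured
    ]
    if not errors:
        raise ValueError("no observed entries to score")
    return sum(errors) / len(errors)

nan = float("nan")
# Two of the four targets are missing, so only two errors are averaged.
print(masked_mae([1.0, 2.0, 3.0, 4.0], [1.5, nan, 2.0, nan]))  # 0.75
```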
Bayesian Temporal Factorization for Multidimensional Time Series Prediction
Large-scale and multidimensional spatiotemporal data sets are becoming
ubiquitous in many real-world applications such as monitoring urban traffic and
air quality. Making predictions on these time series has become a critical
challenge due not only to their large-scale and high-dimensional nature but also
to the considerable amount of missing data. In this paper, we propose a Bayesian
temporal factorization (BTF) framework for modeling multidimensional time
series -- in particular spatiotemporal data -- in the presence of missing
values. By integrating low-rank matrix/tensor factorization and a vector
autoregressive (VAR) process into a single probabilistic graphical model, this
framework can characterize both global and local consistencies in large-scale
time series data. The graphical model allows us to effectively perform
probabilistic predictions and produce uncertainty estimates without imputing
those missing values. We develop efficient Gibbs sampling algorithms for model
inference and model updating for real-time prediction and test the proposed BTF
framework on several real-world spatiotemporal data sets for both missing data
imputation and multi-step rolling prediction tasks. The numerical experiments
demonstrate the superiority of the proposed BTF approaches over existing
state-of-the-art methods.
Comment: 15 pages, 9 figures, 3 tables
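The combination of low-rank factorization with a VAR process can be sketched with point estimates. The code below is not the Bayesian model or the Gibbs sampler from the abstract: it swaps in regularized alternating least squares on the observed entries, followed by a least-squares VAR(1) fit on the temporal factors, and it produces no uncertainty estimates. The function names, the synthetic data, and the missingness pattern are all assumptions for illustration.

```python
import numpy as np

def masked_als(X, mask, rank, n_iter=50, reg=1e-2):
    """Factorize X (series x time) as W @ F using only entries where mask is True."""
    rng = np.random.default_rng(0)
    n, t = X.shape
    W = rng.normal(scale=0.1, size=(n, rank))
    F = rng.normal(scale=0.1, size=(rank, t))
    eye = reg * np.eye(rank)
    for _ in range(n_iter):
        for i in range(n):               # refit row i of W on its observed times
            Fo = F[:, mask[i]]
            W[i] = np.linalg.solve(Fo @ Fo.T + eye, Fo @ X[i, mask[i]])
        for j in range(t):               # refit column j of F on its observed series
            Wo = W[mask[:, j]]
            F[:, j] = np.linalg.solve(Wo.T @ Wo + eye, Wo.T @ X[mask[:, j], j])
    return W, F

def fit_var1(F, reg=1e-6):
    """Least-squares VAR(1) coefficients A such that F[:, t] ~ A @ F[:, t-1]."""
    past, future = F[:, :-1], F[:, 1:]
    gram = past @ past.T + reg * np.eye(F.shape[0])
    return future @ past.T @ np.linalg.inv(gram)

# Synthetic rank-2 data: 8 series mixing one sine and one cosine over 40 steps.
steps = np.arange(41)
latent = np.vstack([np.sin(0.3 * steps), np.cos(0.3 * steps)])
mixing = np.random.default_rng(1).normal(size=(8, 2))
full = mixing @ latent
X_true, x_next = full[:, :40], full[:, 40]

# Hide about a third of the entries in a fixed, hypothetical pattern.
rows, cols = np.indices(X_true.shape)
mask = (rows + cols) % 3 != 0
X = np.where(mask, X_true, np.nan)

W, F = masked_als(X, mask, rank=2)
X_hat = W @ F                            # imputation: read off the hidden entries
A = fit_var1(F)
forecast = W @ (A @ F[:, -1])            # one-step-ahead prediction for all series

hidden_error = np.linalg.norm((X_hat - X_true)[~mask]) / np.linalg.norm(X_true[~mask])
print("relative error on hidden entries:", hidden_error)
print("forecast error:", np.linalg.norm(forecast - x_next))
```

Note that the missing entries are never filled in before fitting; the factor updates only ever index observed values, which mirrors the abstract's point that prediction does not require imputing first.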