Detecting Outliers in Data with Correlated Measures
Advances in sensor technology have enabled the collection of large-scale
datasets. Such datasets can be extremely noisy and often contain a significant
number of outliers that result from sensor malfunction or human operation
faults. In order to utilize such data for real-world applications, it is
critical to detect outliers so that models built from these datasets will not
be skewed by outliers.
In this paper, we propose a new outlier detection method that utilizes the
correlations in the data (e.g., taxi trip distance vs. trip time). Different
from existing outlier detection methods, we build a robust regression model
that explicitly models the outliers and detects outliers simultaneously with
the model fitting.
We validate our approach on real-world datasets against methods specifically
designed for each dataset as well as state-of-the-art outlier detectors.
Our outlier detection method achieves better performance, demonstrating the
robustness and generality of our method. Finally, we report interesting case
studies on some outliers that result from atypical events.
Comment: 10 pages
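The abstract does not give the authors' formulation, so the following is only a minimal sketch of one common way to "model the outliers explicitly": add a sparse per-sample offset `o` to a linear model and alternate between least squares and soft-thresholding of the residuals. The function names, the penalty `lam`, and the synthetic distance/time data are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(r, lam):
    """Shrink residuals toward zero; entries with |r| <= lam become exactly 0."""
    return np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)

def fit_with_outliers(x, y, lam, n_iter=100):
    """Fit y ~ a*x + b + o, where o is a sparse per-sample outlier offset.

    Alternates between least squares on (y - o) and soft-thresholding the
    residuals, so fitting and outlier detection happen in the same loop.
    """
    A = np.column_stack([x, np.ones(len(y))])
    o = np.zeros(len(y))
    for _ in range(n_iter):
        beta, *_ = np.linalg.lstsq(A, y - o, rcond=None)
        o = soft_threshold(y - A @ beta, lam)
    return beta, o

# Synthetic stand-in for "trip distance vs. trip time" (hypothetical data).
rng = np.random.default_rng(0)
distance = rng.uniform(1, 20, size=200)
trip_time = 3.0 * distance + 2.0 + rng.normal(0, 0.5, size=200)
trip_time[:5] += 60.0  # inject five gross outliers

beta, o = fit_with_outliers(distance, trip_time, lam=3.0)
flagged = np.flatnonzero(o != 0)  # samples the model explains as outliers
```

Samples with a nonzero offset are the detected outliers; `lam` trades off sensitivity against false alarms and would need tuning on real data.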
An intelligent information forwarder for healthcare big data systems with distributed wearable sensors
© 2016 IEEE. A growing number of elderly people wish to live independently rather than rely on intrusive care programmes. A big data solution is presented using wearable sensors capable of continuously monitoring the elderly, alerting the relevant caregivers when necessary and forwarding pertinent information to a big data system for analysis. A challenge for such a solution is developing context-awareness from multidimensional, dynamic and nonlinear sensor readings that correlate only weakly with observable human behaviours and health conditions. To address this challenge, a wearable sensor system with an intelligent data forwarder is discussed in this paper. The forwarder adopts a Hidden Markov Model for human behaviour recognition. Locality sensitive hashing is proposed as an efficient mechanism to learn sensor patterns. A prototype solution is implemented to monitor the health conditions of dispersed users. It is shown that the intelligent forwarders can provide the remote sensors with context-awareness. They transmit only important information to the big data server for analytics when certain behaviours happen, and so avoid overwhelming communication and data storage. The system functions unobtrusively, whilst giving the users peace of mind in the knowledge that their safety is being monitored and analysed.
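The abstract does not say which hashing family the forwarder uses. As a rough illustration of how locality sensitive hashing can group similar sensor patterns, the sketch below uses random-hyperplane (cosine) signatures; the function names, bit count, and the three-axis readings are hypothetical.

```python
import numpy as np

def make_hasher(dim, n_bits=16, seed=0):
    """Return a function mapping a sensor vector to an n_bits binary signature.

    Each bit records on which side of a random hyperplane the vector falls,
    so vectors pointing in similar directions tend to share most bits.
    """
    planes = np.random.default_rng(seed).normal(size=(n_bits, dim))

    def signature(x):
        projections = planes @ np.asarray(x, dtype=float)
        return tuple(int(b) for b in projections > 0)

    return signature

def hamming(a, b):
    """Number of differing bits between two signatures."""
    return sum(u != v for u, v in zip(a, b))

signature = make_hasher(dim=3)
reading = [0.9, 0.1, 9.8]              # hypothetical accelerometer sample
scaled = [2 * v for v in reading]      # same direction, larger magnitude
opposite = [-v for v in reading]       # reversed direction

same_bucket = signature(reading) == signature(scaled)
far_apart = hamming(signature(reading), signature(opposite))
```

Readings that differ only in magnitude land in the same bucket, while a reversed reading differs in every bit, which is the property a forwarder could use to decide cheaply whether a new pattern resembles a known one.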
Secure Distributed Dynamic State Estimation in Wide-Area Smart Grids
The smart grid is a large, complex network with a myriad of vulnerabilities,
usually operated in adversarial settings and regulated based on estimated
system states. In this study, we propose a novel highly secure distributed
dynamic state estimation mechanism for wide-area (multi-area) smart grids,
composed of geographically separated subregions, each supervised by a local
control center. We first propose a distributed state estimator, assuming
regular system operation, that achieves near-optimal performance based on the
local Kalman filters and with the exchange of necessary information between
local centers. To enhance the security, we further propose to (i) protect the
network database and the network communication channels against attacks and
data manipulations via a blockchain (BC)-based system design, where the BC
operates on the peer-to-peer network of local centers, (ii) locally detect the
measurement anomalies in real-time to eliminate their effects on the state
estimation process, and (iii) detect misbehaving (hacked/faulty) local centers
in real-time via a distributed trust management scheme over the network. We
provide theoretical guarantees regarding the false alarm rates of the proposed
detection schemes, where the false alarms can be easily controlled. Numerical
studies illustrate that the proposed mechanism offers reliable state estimation
under regular system operation, timely and accurate detection of anomalies, and
good state recovery performance in case of anomalies.
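Step (ii), local detection of measurement anomalies, can be illustrated with a single scalar Kalman filter that tests each innovation before using it. This is a sketch under strong assumptions (random-walk state, one local center, no blockchain or trust layer); the function name and the values of `q`, `r`, and `threshold` are illustrative and not taken from the paper.

```python
import numpy as np

def local_kalman_filter(zs, q=0.01, r=0.09, threshold=25.0):
    """Scalar Kalman filter that skips measurements failing an innovation test.

    q: process noise variance, r: measurement noise variance.
    A measurement is flagged when its normalized squared innovation exceeds
    `threshold`; raising the threshold lowers the false alarm rate.
    """
    x, p = 0.0, 1.0  # initial state estimate and its variance
    estimates, flags = [], []
    for z in zs:
        p = p + q                 # predict step for a random-walk state
        nu = z - x                # innovation
        s = p + r                 # innovation variance
        anomalous = nu * nu / s > threshold
        if not anomalous:         # update only with trusted measurements
            k = p / s
            x = x + k * nu
            p = (1.0 - k) * p
        estimates.append(x)
        flags.append(anomalous)
    return np.array(estimates), np.array(flags)

rng = np.random.default_rng(1)
readings = 1.0 + rng.normal(0.0, 0.3, size=50)
readings[25] += 5.0  # simulated manipulated measurement

estimates, flags = local_kalman_filter(readings)
```

Because the flagged measurement is excluded from the update, the estimate continues from the prediction, which is one simple way to limit the effect of an anomaly on the state estimate.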
Optimal Detection of Faulty Traffic Sensors Used in Route Planning
In a smart city, real-time traffic sensors may be deployed for various
applications, such as route planning. Unfortunately, sensors are prone to
failures, which result in erroneous traffic data. Erroneous data can adversely
affect applications such as route planning, and can cause increased travel
time. To minimize the impact of sensor failures, we must detect them promptly
and accurately. However, typical detection algorithms may lead to a large
number of false positives (i.e., false alarms) and false negatives (i.e.,
missed detections), which can result in suboptimal route planning. In this
paper, we devise an effective detector for identifying faulty traffic sensors
using a prediction model based on Gaussian Processes. Further, we present an
approach for computing the optimal parameters of the detector which minimize
losses due to false-positive and false-negative errors. We also characterize
critical sensors, whose failure can have high impact on the route planning
application. Finally, we implement our method and evaluate it numerically using
a real-world dataset and the route planning platform OpenTripPlanner.
Comment: Proceedings of The 2nd Workshop on Science of Smart City Operations
and Platforms Engineering (SCOPE 2017), Pittsburgh, PA USA, April 2017, 6
pages
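A prediction-based detector of this kind can be sketched as follows: a Gaussian Process predicts what a sensor should report, and the sensor is flagged when its reading falls too far from that prediction. The kernel, its hyperparameters, the threshold `eta`, and the sinusoidal "traffic" signal are assumptions for illustration; the paper's approach of choosing detector parameters to minimize false-positive and false-negative losses is not reproduced here.

```python
import numpy as np

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_predict(t_train, y_train, t_test, noise=0.05):
    """GP posterior mean and predictive standard deviation at t_test."""
    K = rbf(t_train, t_train) + noise**2 * np.eye(len(t_train))
    Ks = rbf(t_test, t_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    cov = rbf(t_test, t_test) - Ks @ np.linalg.solve(K, Ks.T)
    std = np.sqrt(np.maximum(np.diag(cov), 0.0) + noise**2)
    return mean, std

def is_faulty(observed, mean, std, eta=4.0):
    """Flag readings more than eta predictive standard deviations off."""
    return np.abs(observed - mean) > eta * std

# Hypothetical smooth traffic signal sampled over time.
t_train = np.linspace(0.0, 10.0, 40)
y_train = np.sin(t_train)
t_test = np.array([5.1])

mean, std = gp_predict(t_train, y_train, t_test)
healthy = is_faulty(np.sin(t_test), mean, std)        # consistent reading
broken = is_faulty(np.sin(t_test) + 1.0, mean, std)   # reading with an offset
```

A smaller `eta` catches more faults at the cost of more false alarms, which is the trade-off the abstract proposes to optimize.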
Robust Anomaly Detection in Dynamic Networks
We propose two robust methods for anomaly detection in dynamic networks in
which the properties of normal traffic are time-varying. We formulate the
robust anomaly detection problem as a binary composite hypothesis testing
problem and propose two methods: a model-free and a model-based one, leveraging
techniques from the theory of large deviations. Both methods require a family
of Probability Laws (PLs) that represent normal properties of traffic. We
devise a two-step procedure to estimate this family of PLs. We compare the
performance of our robust methods and their vanilla counterparts, which assume
that normal traffic is stationary, on a network with a diurnal normal pattern
and a common anomaly related to data exfiltration. Simulation results show that
our robust methods perform better than their vanilla counterparts in dynamic
networks.
Comment: 6 pages. MED conference
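One plausible reading of the model-free method is a Hoeffding-style test: compare the empirical distribution of a traffic window against each Probability Law in the family and raise an alarm only if it is far (in relative entropy) from all of them. The sketch below follows that reading; the three-symbol alphabet, the "day" and "night" laws, and the threshold are hypothetical, and the abstract's two-step estimation of the family is not shown.

```python
import numpy as np

def empirical_measure(samples, n_symbols):
    """Relative frequency of each symbol in a window of quantized traffic."""
    counts = np.bincount(samples, minlength=n_symbols)
    return counts / counts.sum()

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) in nats, ignoring symbols with p = 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

def robust_is_anomalous(samples, family, threshold):
    """Alarm only if the window is far from every law in the family."""
    p = empirical_measure(samples, len(family[0]))
    return min(kl_divergence(p, q) for q in family) > threshold

# Hypothetical laws for a network with a diurnal pattern.
day = np.array([0.7, 0.2, 0.1])
night = np.array([0.2, 0.3, 0.5])
family = [day, night]

night_window = np.array([0] * 20 + [1] * 30 + [2] * 50)  # normal night traffic
burst_window = np.array([1] * 100)                       # unlike either law

robust_night = robust_is_anomalous(night_window, family, threshold=0.1)
robust_burst = robust_is_anomalous(burst_window, family, threshold=0.1)
vanilla_night = robust_is_anomalous(night_window, [day], threshold=0.1)
```

With only the daytime law, as in a stationary "vanilla" detector, normal night traffic triggers a false alarm, whereas the family-based test accepts it and still flags the burst.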