22,527 research outputs found

    Detecting Outliers in Data with Correlated Measures

    Full text link
    Advances in sensor technology have enabled the collection of large-scale datasets. Such datasets can be extremely noisy and often contain a significant amount of outliers that result from sensor malfunction or human operation faults. In order to utilize such data for real-world applications, it is critical to detect outliers so that models built from these datasets will not be skewed by outliers. In this paper, we propose a new outlier detection method that utilizes the correlations in the data (e.g., taxi trip distance vs. trip time). Different from existing outlier detection methods, we build a robust regression model that explicitly models the outliers and detects outliers simultaneously with the model fitting. We validate our approach on real-world datasets against methods specifically designed for each dataset as well as the state of the art outlier detectors. Our outlier detection method achieves better performances, demonstrating the robustness and generality of our method. Last, we report interesting case studies on some outliers that result from atypical events.Comment: 10 page

    An intelligent information forwarder for healthcare big data systems with distributed wearable sensors

    Get PDF
    © 2016 IEEE. An increasing number of the elderly population wish to live an independent lifestyle, rather than rely on intrusive care programmes. A big data solution is presented using wearable sensors capable of carrying out continuous monitoring of the elderly, alerting the relevant caregivers when necessary and forwarding pertinent information to a big data system for analysis. A challenge for such a solution is the development of context-awareness through the multidimensional, dynamic and nonlinear sensor readings that have a weak correlation with observable human behaviours and health conditions. To address this challenge, a wearable sensor system with an intelligent data forwarder is discussed in this paper. The forwarder adopts a Hidden Markov Model for human behaviour recognition. Locality sensitive hashing is proposed as an efficient mechanism to learn sensor patterns. A prototype solution is implemented to monitor health conditions of dispersed users. It is shown that the intelligent forwarders can provide the remote sensors with context-awareness. They transmit only important information to the big data server for analytics when certain behaviours happen and avoid overwhelming communication and data storage. The system functions unobtrusively, whilst giving the users peace of mind in the knowledge that their safety is being monitored and analysed

    Secure Distributed Dynamic State Estimation in Wide-Area Smart Grids

    Full text link
    Smart grid is a large complex network with a myriad of vulnerabilities, usually operated in adversarial settings and regulated based on estimated system states. In this study, we propose a novel highly secure distributed dynamic state estimation mechanism for wide-area (multi-area) smart grids, composed of geographically separated subregions, each supervised by a local control center. We firstly propose a distributed state estimator assuming regular system operation, that achieves near-optimal performance based on the local Kalman filters and with the exchange of necessary information between local centers. To enhance the security, we further propose to (i) protect the network database and the network communication channels against attacks and data manipulations via a blockchain (BC)-based system design, where the BC operates on the peer-to-peer network of local centers, (ii) locally detect the measurement anomalies in real-time to eliminate their effects on the state estimation process, and (iii) detect misbehaving (hacked/faulty) local centers in real-time via a distributed trust management scheme over the network. We provide theoretical guarantees regarding the false alarm rates of the proposed detection schemes, where the false alarms can be easily controlled. Numerical studies illustrate that the proposed mechanism offers reliable state estimation under regular system operation, timely and accurate detection of anomalies, and good state recovery performance in case of anomalies

    Optimal Detection of Faulty Traffic Sensors Used in Route Planning

    Full text link
    In a smart city, real-time traffic sensors may be deployed for various applications, such as route planning. Unfortunately, sensors are prone to failures, which result in erroneous traffic data. Erroneous data can adversely affect applications such as route planning, and can cause increased travel time. To minimize the impact of sensor failures, we must detect them promptly and accurately. However, typical detection algorithms may lead to a large number of false positives (i.e., false alarms) and false negatives (i.e., missed detections), which can result in suboptimal route planning. In this paper, we devise an effective detector for identifying faulty traffic sensors using a prediction model based on Gaussian Processes. Further, we present an approach for computing the optimal parameters of the detector which minimize losses due to false-positive and false-negative errors. We also characterize critical sensors, whose failure can have high impact on the route planning application. Finally, we implement our method and evaluate it numerically using a real-world dataset and the route planning platform OpenTripPlanner.Comment: Proceedings of The 2nd Workshop on Science of Smart City Operations and Platforms Engineering (SCOPE 2017), Pittsburgh, PA USA, April 2017, 6 page

    Robust Anomaly Detection in Dynamic Networks

    Get PDF
    We propose two robust methods for anomaly detection in dynamic networks in which the properties of normal traffic are time-varying. We formulate the robust anomaly detection problem as a binary composite hypothesis testing problem and propose two methods: a model-free and a model-based one, leveraging techniques from the theory of large deviations. Both methods require a family of Probability Laws (PLs) that represent normal properties of traffic. We devise a two-step procedure to estimate this family of PLs. We compare the performance of our robust methods and their vanilla counterparts, which assume that normal traffic is stationary, on a network with a diurnal normal pattern and a common anomaly related to data exfiltration. Simulation results show that our robust methods perform better than their vanilla counterparts in dynamic networks.Comment: 6 pages. MED conferenc
    • …
    corecore