Data Mining to Uncover Heterogeneous Water Use Behaviors From Smart Meter Data
Knowledge on the determinants and patterns of water demand for different consumers supports the design of customized demand management strategies. Smart meters coupled with big data analytics tools create a unique opportunity to support such strategies. Yet, at present, the information content of smart meter data is not fully mined and usually needs to be complemented with water fixture inventory and survey data to achieve detailed customer segmentation based on end use water usage. In this paper, we developed a data-driven approach that extracts information on heterogeneous water end use routines, main end use components, and temporal characteristics, only via data mining existing smart meter readings at the scale of individual households. We tested our approach on data from 327 households in Australia, each monitored with smart meters logging water use readings every 5 s. As part of the approach, we first disaggregated the household-level water use time series into different end uses via Autoflow. We then adapted a customer segmentation based on eigenbehavior analysis to discriminate among heterogeneous water end use routines and identify clusters of consumers presenting similar routines. Results revealed three main water end use profile clusters, each characterized by a primary end use: shower, clothes washing, and irrigation. Time-of-use and intensity-of-use differences exist within each class, as well as different characteristics of regularity and periodicity over time. Our customer segmentation analysis approach provides utilities with a concise snapshot of recurrent water use routines from smart meter data and can be used to support customized demand management strategies. TU Berlin, Open-Access-Mittel - 201
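The eigenbehavior step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each household is summarized by a short vector of end-use shares (shower, clothes washing, irrigation) and extracts the leading principal direction by power iteration. The data, the function name `leading_eigenbehavior`, and the sign-based grouping are all hypothetical; this is not the authors' pipeline, which works on disaggregated 5 s readings.

```python
import math
import random


def leading_eigenbehavior(profiles, iterations=200, seed=0):
    """Return (mean profile, leading principal direction) via power iteration."""
    n, d = len(profiles), len(profiles[0])
    mean = [sum(row[j] for row in profiles) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in profiles]
    rng = random.Random(seed)
    v = [rng.random() for _ in range(d)]
    for _ in range(iterations):
        # Apply X^T X to v without forming the covariance matrix explicitly.
        scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return mean, v


# Hypothetical end-use shares per household: [shower, clothes washing, irrigation].
profiles = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
]

mean, direction = leading_eigenbehavior(profiles)

# Project each household onto the leading direction; the sign of the score
# separates shower-dominated from irrigation-dominated households here.
scores = [
    sum((row[j] - mean[j]) * direction[j] for j in range(len(mean)))
    for row in profiles
]
groups = [score > 0 for score in scores]
print(groups)
```

The sign of a principal direction is arbitrary, so only the grouping is meaningful, not which group is labeled `True`.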
Statistical Traffic State Analysis in Large-scale Transportation Networks Using Locality-Preserving Non-negative Matrix Factorization
Statistical traffic data analysis is a hot topic in traffic management and
control. In this field, current research focuses on analyzing traffic
flows of individual links or local regions in a transportation network. Less
attention is paid to the global view of traffic states over the entire
network, which is important for modeling large-scale traffic scenes. Our aim is
precisely to propose a new methodology for extracting spatio-temporal traffic
patterns, ultimately for modeling large-scale traffic dynamics, and long-term
traffic forecasting. We attack this issue by utilizing Locality-Preserving
Non-negative Matrix Factorization (LPNMF) to derive low-dimensional
representation of network-level traffic states. Clustering is performed on the
compact LPNMF projections to unveil typical spatial patterns and temporal
dynamics of network-level traffic states. We have tested the proposed method on
simulated traffic data generated for a large-scale road network, and the reported
experimental results validate the ability of our approach to extract
meaningful large-scale space-time traffic patterns. Furthermore, the derived
clustering results provide an intuitive understanding of spatial-temporal
characteristics of traffic flows in the large-scale network, and a basis for
potential long-term forecasting. Comment: IET Intelligent Transport Systems (2013
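For orientation, the low-dimensional representation described above can be sketched with plain non-negative matrix factorization using multiplicative updates. This is deliberately not LPNMF: the locality-preserving term of the paper is omitted, and the toy matrix (rows as time snapshots, columns as road links) and all function names are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    )


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Plain NMF (V ~ WH) with multiplicative updates for squared error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [
            [h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
            for k in range(rank)
        ]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [
            [w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
            for i in range(n)
        ]
    return w, h


# Hypothetical network states: 4 time snapshots (rows) x 4 links (columns).
traffic = [
    [9.0, 8.0, 1.0, 1.0],
    [8.0, 9.0, 1.0, 2.0],
    [1.0, 1.0, 9.0, 8.0],
    [2.0, 1.0, 8.0, 9.0],
]

w_first, h_first = nmf(traffic, rank=2, iterations=1)
w, h = nmf(traffic, rank=2, iterations=200)
first_error = frobenius_error(traffic, w_first, h_first)
final_error = frobenius_error(traffic, w, h)
print(first_error, final_error)
```

Each row of `w` is the compact representation of one snapshot. A clustering step, as in the abstract, would operate on these rows.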
Organized Behavior Classification of Tweet Sets using Supervised Learning Methods
During the 2016 US elections, Twitter experienced unprecedented levels of
propaganda and fake news through the collaboration of bots and hired persons,
the ramifications of which are still being debated. This work proposes an
approach to identify the presence of organized behavior in tweets. The Random
Forest, Support Vector Machine, and Logistic Regression algorithms are each
used to train a model with a data set of 850 records consisting of 299 features
extracted from tweets gathered during the 2016 US presidential election. The
features represent user and temporal synchronization characteristics to capture
coordinated behavior. These models are trained to classify tweet sets among the
categories: organic vs organized, political vs non-political, and pro-Trump vs
pro-Hillary vs neither. The random forest algorithm performs best, with
greater than 95% average accuracy and F-measure scores for each category. The
most valuable features for classification are identified as user-based
features, with media use and marking tweets as favorite being the most
dominant. Comment: 51 pages, 5 figure
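The two scores reported above, accuracy and F-measure, can be computed as in the following minimal sketch. The labels and predictions are invented for illustration and are not data from the paper. The F-measure here is the standard F1 score for a single positive class.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five tweet sets (organic vs organized).
y_true = ["organized", "organized", "organic", "organic", "organized"]
y_pred = ["organized", "organic", "organic", "organic", "organized"]

acc = accuracy(y_true, y_pred)
f1 = f_measure(y_true, y_pred, positive="organized")
print(acc, f1)
```

In this toy case four of five predictions are correct, precision is 1 and recall is 2/3, so both scores come out at about 0.8.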
Outlier detection techniques for wireless sensor networks: A survey
In the field of wireless sensor networks, measurements that significantly deviate from the normal pattern of sensed data are considered outliers. The potential sources of outliers include noise and errors, events, and malicious attacks on the network. Traditional outlier detection techniques are not directly applicable to wireless sensor networks due to the nature of sensor data and the specific requirements and limitations of wireless sensor networks. This survey provides a comprehensive overview of existing outlier detection techniques specifically developed for wireless sensor networks. Additionally, it presents a technique-based taxonomy and a comparative table to be used as a guideline to select a technique suitable for the application at hand based on characteristics such as data type, outlier type, outlier identity, and outlier degree.
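As a concrete instance of flagging a measurement that deviates from the normal pattern, the sketch below applies a generic robust-statistics check based on the median absolute deviation (MAD). It is not a technique attributed to the survey. The readings, the function name `mad_outliers`, and the 3.5 threshold are assumptions for illustration.

```python
import statistics


def mad_outliers(readings, threshold=3.5):
    """Return indices of readings whose modified z-score exceeds the threshold."""
    median = statistics.median(readings)
    deviations = [abs(x - median) for x in readings]
    mad = statistics.median(deviations)
    if mad == 0:
        # All readings are (nearly) identical; nothing can be ranked.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # under a normal distribution.
    return [
        i for i, dev in enumerate(deviations) if 0.6745 * dev / mad > threshold
    ]


# Hypothetical temperature readings from one sensor node (degrees Celsius).
readings = [21.0, 21.2, 20.9, 21.1, 35.0, 21.0, 20.8]
flagged = mad_outliers(readings)
print(flagged)  # [4]
```

A check like this treats every large deviation alike, so it cannot tell a sensor fault from a real event.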