Communication Theoretic Data Analytics
Widespread use of the Internet and social networks invokes the generation of
big data, which is proving to be useful in a number of applications. To deal
with explosively growing amounts of data, data analytics has emerged as a
critical technology related to computing, signal processing, and information
networking. In this paper, a formalism is considered in which data is modeled
as a generalized social network and communication theory and information theory
are thereby extended to data analytics. First, the creation of an equalizer to
optimize information transfer between two data variables is considered, and
financial data is used to demonstrate the advantages. Then, an information
coupling approach based on information geometry is applied for dimensionality
reduction, with a pattern recognition example to illustrate the effectiveness.
These initial trials suggest the potential of communication theoretic data
analytics for a wide range of applications.
Comment: Published in IEEE Journal on Selected Areas in Communications, Jan.
201
Performance evaluation of real-time multivariate data reduction models for adaptive-threshold in wireless sensor networks
This paper presents a new metric for assessing the performance of multivariate data reduction models in wireless sensor networks (WSNs). The proposed metric, the Updating Frequency Metric (UFM), is defined as the frequency with which the model reference parameters are updated during data collection. A method for estimating the error threshold during the training phase is also suggested; this threshold determines when the model reference parameters need to be updated. Numerical analysis and simulation results show that the proposed metric is effective for evaluating multivariate data reduction models in terms of sensor node energy consumption. Furthermore, the proposed adaptive threshold outperforms the non-adaptive threshold in reducing the frequency of parameter updates, which prolongs the lifetime of the node. Compared with the non-adaptive threshold, the adaptive threshold improves the updating frequency by 80% and 52% for the MLR-B and PCA-B multivariate data reduction models, respectively.
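The idea of an updating frequency can be illustrated with a minimal Python sketch. It is not the paper's method: the single-value reference model, the function names, and the threshold rule (mean plus `k` standard deviations of the absolute first differences in the training data) are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def estimate_threshold(training, k=2.0):
    """Hypothetical threshold rule (an assumption, not the paper's estimator):
    mean + k * std of the absolute first differences seen during training."""
    diffs = [abs(b - a) for a, b in zip(training, training[1:])]
    return mean(diffs) + k * pstdev(diffs)


def updating_frequency(readings, threshold):
    """Fraction of samples whose error against the current reference value
    exceeds the threshold and therefore forces a reference update."""
    reference = readings[0]
    updates = 0
    for value in readings[1:]:
        if abs(value - reference) > threshold:
            reference = value  # the update the node would have to transmit
            updates += 1
    return updates / (len(readings) - 1)


readings = [20.0, 20.1, 20.2, 21.5, 21.6, 21.4, 23.0]
tight = updating_frequency(readings, 0.5)  # two updates in six samples
loose = updating_frequency(readings, 2.0)  # one update in six samples
```

A lower value means fewer transmissions of updated parameters, which is the link to node energy consumption that the abstract describes.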
Exploitation of Data Correlation and Performance Enhancement in Wireless Sensor Networks
With the combination of wireless communications and embedded systems, much progress has been made in the area of wireless sensor networks (WSNs). These networks have already been widely deployed thanks to their self-organization capacity and low cost. However, some technical challenges still need to be addressed. In this thesis, three algorithms are proposed for improving network energy efficiency, detecting data faults, and reducing data redundancy.
The basic principle behind the proposed algorithms is the correlation in the data collected by WSNs. The first algorithm, for sensor scheduling, is based on the spatial correlation between neighboring sensor readings. Given the spatial correlation, sensor nodes are clustered into groups. At each time instant, only one node within each group works as the group representative, sensing and transmitting sensor data. Sensor nodes take turns as group representative, so the energy that the other nodes in the same group would have consumed is saved.
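The scheduling idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the greedy grouping rule, the Pearson-correlation cutoff `min_corr`, and all function names are hypothetical and are not taken from the thesis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series (both must be non-constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def group_by_correlation(history, min_corr=0.9):
    """Greedy grouping (an assumed rule): a node joins the first group whose
    founding node it correlates with at min_corr or above."""
    groups = []
    for node, series in history.items():
        for group in groups:
            if pearson(history[group[0]], series) >= min_corr:
                group.append(node)
                break
        else:
            groups.append([node])
    return groups


def representatives(groups, t):
    """Round-robin: at time step t, one node per group senses and transmits."""
    return [group[t % len(group)] for group in groups]


history = {
    "n1": [1.0, 2.0, 3.0, 4.0],
    "n2": [1.1, 2.0, 3.2, 3.9],  # closely tracks n1
    "n3": [4.0, 3.0, 2.0, 1.0],  # anti-correlated with n1
}
groups = group_by_correlation(history)
```

With these readings, `n1` and `n2` share a group and alternate as representative, while `n3` forms its own group and is always active.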
Because the data to be collected are continuous in nature, the temporal and spatial correlation of sensor data is exploited to detect faulty data. Using temporal correlation, the normal range of upcoming sensor data can be predicted from historical observations. Using spatial correlation, weighted neighbor voting can diagnose whether a sensor value is reliable. The status of the sensor data, normal or faulty, is decided by combining these two detection procedures.
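A rough Python sketch of how two such checks might be combined follows. The abstract does not specify the prediction model, the voting rule, or the combination logic, so everything here is an assumption: the normal range is taken as the mean of recent history plus or minus `k` standard deviations, neighbors vote by weighted majority, and a reading is flagged only when both checks fail.

```python
from statistics import mean, pstdev


def outside_temporal_range(history, value, k=3.0):
    """Temporal check (assumed form): value lies outside mean +/- k * std of history."""
    mu, sigma = mean(history), pstdev(history)
    return abs(value - mu) > k * sigma


def neighbors_vote_faulty(value, neighbor_values, weights, tolerance=1.0):
    """Spatial check (assumed form): neighbors whose reading differs from the
    value by more than `tolerance` cast their weight as a 'faulty' vote."""
    disagree = sum(
        w for v, w in zip(neighbor_values, weights) if abs(v - value) > tolerance
    )
    return disagree > sum(weights) / 2


def is_faulty(history, value, neighbor_values, weights, tolerance=1.0, k=3.0):
    """Hypothetical combination rule: faulty only if both checks flag the value."""
    return outside_temporal_range(history, value, k) and neighbors_vote_faulty(
        value, neighbor_values, weights, tolerance
    )


history = [20.0, 20.2, 19.9, 20.1, 20.0]
weights = [0.5, 0.3, 0.2]

spike = is_faulty(history, 35.0, [20.1, 20.3, 34.8], weights)  # neighbors disagree
event = is_faulty(history, 35.0, [34.9, 35.2, 20.0], weights)  # neighbors agree
normal = is_faulty(history, 20.1, [20.1, 20.3, 20.0], weights)
```

Requiring both checks to agree keeps a genuine event, where neighbors observe the same jump, from being reported as a fault; other combination rules are equally plausible.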
Like the sensor scheduling algorithm, the recursive principal component analysis (R-PCA) based algorithm exploits spatial correlation, in this case to detect faulty data and aggregate redundant data. The R-PCA model processes the sensor data with the help of the squared prediction error (SPE) score and the cumulative percentage formula. When the SPE score of a collected datum is distinctly larger than that of normal data, a fault is detected. The data dimension is reduced according to the result of the cumulative percentage formula. All the algorithms are simulated in OPNET or MATLAB using practical and synthetic datasets. The performance of the proposed algorithms is evaluated in each chapter.
Unsupervised anomaly detection for unlabelled wireless sensor networks data
With advances in sensor technology, sensor nodes, tiny yet powerful devices, are used to collect data from various domains. As the sensor nodes communicate continuously from the target areas to the base station, hundreds of thousands of data points are collected for decision making. Unfortunately, this large amount of data is collected and stored at the base station unlabeled, and in most cases the data are not reliable for several reasons. Therefore, this paper uses the unsupervised one-class SVM (OCSVM) to build anomaly detection schemes for better decision making. Unsupervised OCSVM is well suited to the WSN domain because only one class of training data is needed to build the normal reference model. Furthermore, dimension reduction is used to minimize resource usage, given the resource constraints of WSNs. Accordingly, one of the OCSVM variants, the Centered Hyper-ellipsoidal Support Vector Machine (CESVM), is used as the classifier, while the Candid Covariance-Free Incremental Principal Component Analysis (CCIPCA) algorithm serves as the dimension reduction step of the proposed anomaly detection scheme. Environmental datasets collected from available WSN data are used to evaluate the performance of the proposed scheme. The proposed scheme shows results comparable to those of other related methods on all datasets in terms of detection rate, detection accuracy, and false alarm rate.
Transform-based Distributed Data Gathering
A general class of unidirectional transforms is presented that can be
computed in a distributed manner along an arbitrary routing tree. Additionally,
we provide a set of conditions under which these transforms are invertible.
These transforms can be computed as data is routed towards the collection (or
sink) node in the tree and exploit data correlation between nodes in the tree.
Moreover, when used in wireless sensor networks, these transforms can also
leverage data received at nodes via broadcast wireless communications. Various
constructions of unidirectional transforms are also provided for use in data
gathering in wireless sensor networks. New wavelet transforms are also proposed
which provide significant improvements over existing unidirectional transforms.
Gravitational Clustering: A Simple, Robust and Adaptive Approach for Distributed Networks
Distributed signal processing for wireless sensor networks enables different
devices to cooperate in solving different signal processing tasks. A
crucial first step is to answer the question: who observes what? Recently,
several distributed algorithms have been proposed, which frame the
signal/object labelling problem in terms of cluster analysis after extracting
source-specific features; however, the number of clusters is assumed to be
known. We propose a new method called Gravitational Clustering (GC) to
adaptively estimate the time-varying number of clusters based on a set of
feature vectors. The key idea is to exploit the physical principle of
gravitational force between mass units: streaming-in feature vectors are
considered as mass units of fixed position in the feature space, around which
mobile mass units are injected at each time instant. The cluster enumeration
exploits the fact that the highest attraction on the mobile mass units is
exerted by regions with a high density of feature vectors, i.e., gravitational
clusters. By sharing estimates among neighboring nodes via a
diffusion-adaptation scheme, cooperative and distributed cluster enumeration is
achieved. Numerical experiments concerning robustness against outliers,
convergence and computational complexity are conducted. The application in a
distributed cooperative multi-view camera network illustrates the applicability
to real-world problems.
Comment: 12 pages, 9 figures