5,088 research outputs found

    Communication Theoretic Data Analytics

    Full text link
    Widespread use of the Internet and social networks invokes the generation of big data, which is proving to be useful in a number of applications. To deal with explosively growing amounts of data, data analytics has emerged as a critical technology related to computing, signal processing, and information networking. In this paper, a formalism is considered in which data is modeled as a generalized social network and communication theory and information theory are thereby extended to data analytics. First, the creation of an equalizer to optimize information transfer between two data variables is considered, and financial data is used to demonstrate the advantages. Then, an information coupling approach based on information geometry is applied for dimensionality reduction, with a pattern recognition example to illustrate the effectiveness. These initial trials suggest the potential of communication theoretic data analytics for a wide range of applications.Comment: Published in IEEE Journal on Selected Areas in Communications, Jan. 201

    Performance evaluation of real-time multivariate data reduction models for adaptive-threshold in wireless sensor networks

    Get PDF
    This paper presents a new metric to assess the performance of different multivariate data reduction models in wireless sensor networks (WSNs). The proposed metric is called Updating Frequency Metric (UFM) which is defined as the frequency of updating the model reference parameters during data collection. A method for estimating the error threshold value during the training phase is also suggested. The proposed threshold of error is used to update the model reference parameters when it is necessary. Numerical analysis and simulation results show that the proposed metric validates its effectiveness in the performance of multivariate data reduction models in terms of the sensor node energy consumption. Furthermore, the proposed adaptive threshold enhances the model's performance more than the non-adaptive threshold in decreasing the frequency of updating the model reference parameters which positively prolongs the lifetime of the node. The adaptive threshold improves the frequency of updating the parameters by 80% and 52% in comparison to the non-adaptive threshold for multivariate data reduction models of MLR-B and PCA-B respectively

    Exploitation of Data Correlation and Performance Enhancement in Wireless Sensor Networks

    Get PDF
    With the combination of wireless communications and embedded system, lots of progress has been made in the area of wireless sensor networks (WSNs). The networks have already been widely deployed, due to their self-organization capacity and low-cost advantage. However, there are still some technical challenges needed to be addressed. In the thesis, three algorithms are proposed in improving network energy efficiency, detecting data fault and reducing data redundancy. The basic principle behind the proposed algorithms is correlation in the data collected by WSNs. The first sensor scheduling algorithm is based on the spatial correlation between neighbor sensor readings. Given the spatial correlation, sensor nodes are clustered into groups. At each time instance, only one node within each group works as group representative, namely, sensing and transmitting sensor data. Sensor nodes take turns to be group representative. Therefore, the energy consumed by other sensor nodes within the same group can be saved. Due to the continuous nature of the data to be collected, temporal and spatial correlation of sensor data has been exploited to detect the faulty data. By exploitation of temporal correlation, the normal range of upcoming sensor data can be predicted by the historical observations. Based on spatial correlation, weighted neighbor voting can be used to diagnose whether the value of sensor data is reliable. The status of the sensor data, normal or faulty, is decided by the combination of these two proposed detection procedures. Similar to the sensor scheduling algorithm, the recursive principal component analysis (RPCA) based algorithm has been studied to detect faulty data and aggregate redundant data by exploitation of spatial correlation as well. The R-PCA model is used to process the sensor data, with the help of squared prediction error (SPE) score and cumulative percentage formula. When SPE score of a collected datum is distinctly larger than that of normal data, faults can be detected. The data dimension is reduced according to the calculation result of cumulative percentage formula. All the algorithms are simulated in OPNET or MATLAB based on practical and synthetic datasets. Performances of the proposed algorithms are evaluated in each chapter

    Unsupervised anomaly detection for unlabelled wireless sensor networks data

    Get PDF
    With the advances in sensor technology, sensor nodes, the tiny yet powerful device are used to collect data from the various domain. As the sensor nodes communicate continuously from the target areas to base station, hundreds of thousands of data are collected to be used for the decision making. Unfortunately, the big amount of unlabeled data collected and stored at the base station. In most cases, data are not reliable due to several reasons. Therefore, this paper will use the unsupervised one-class SVM (OCSVM) to build the anomaly detection schemes for better decision making. Unsupervised OCSVM is preferable to be used in WSNs domain due to the one class of data training is used to build normal reference model. Furthermore, the dimension reduction is used to minimize the resources usage due to resource constraint incurred in WSNs domain. Therefore one of the OCSVM variants namely Centered Hyper-ellipsoidal Support Vector Machine (CESVM) is used as classifier while Candid-Covariance Free Incremental Principal Component Analysis (CCIPCA) algorithm is served as dimension reduction for proposed anomaly detection scheme. Environmental dataset collected from available WSNs data is used to evaluate the performance measures of the proposed scheme. As the results, the proposed scheme shows comparable results for all datasets in term of detection rate, detection accuracy and false alarm rate as compared with other related methods

    Transform-based Distributed Data Gathering

    Full text link
    A general class of unidirectional transforms is presented that can be computed in a distributed manner along an arbitrary routing tree. Additionally, we provide a set of conditions under which these transforms are invertible. These transforms can be computed as data is routed towards the collection (or sink) node in the tree and exploit data correlation between nodes in the tree. Moreover, when used in wireless sensor networks, these transforms can also leverage data received at nodes via broadcast wireless communications. Various constructions of unidirectional transforms are also provided for use in data gathering in wireless sensor networks. New wavelet transforms are also proposed which provide significant improvements over existing unidirectional transforms

    Gravitational Clustering: A Simple, Robust and Adaptive Approach for Distributed Networks

    Full text link
    Distributed signal processing for wireless sensor networks enables that different devices cooperate to solve different signal processing tasks. A crucial first step is to answer the question: who observes what? Recently, several distributed algorithms have been proposed, which frame the signal/object labelling problem in terms of cluster analysis after extracting source-specific features, however, the number of clusters is assumed to be known. We propose a new method called Gravitational Clustering (GC) to adaptively estimate the time-varying number of clusters based on a set of feature vectors. The key idea is to exploit the physical principle of gravitational force between mass units: streaming-in feature vectors are considered as mass units of fixed position in the feature space, around which mobile mass units are injected at each time instant. The cluster enumeration exploits the fact that the highest attraction on the mobile mass units is exerted by regions with a high density of feature vectors, i.e., gravitational clusters. By sharing estimates among neighboring nodes via a diffusion-adaptation scheme, cooperative and distributed cluster enumeration is achieved. Numerical experiments concerning robustness against outliers, convergence and computational complexity are conducted. The application in a distributed cooperative multi-view camera network illustrates the applicability to real-world problems.Comment: 12 pages, 9 figure
    corecore