124 research outputs found

    Outlier Mining Methods Based on Graph Structure Analysis

    Outlier detection in high-dimensional datasets is a fundamental and challenging problem across disciplines that also has practical implications, as removing outliers from the training set improves the performance of machine learning algorithms. While many outlier mining algorithms have been proposed in the literature, they tend to be valid or efficient only for specific types of datasets (time series, images, videos, etc.). Here we propose two methods that can be applied to generic datasets, as long as there is a meaningful measure of distance between pairs of elements of the dataset. Both methods start by defining a graph, where the nodes are the elements of the dataset, and the links have associated weights that are the distances between the nodes. Then, the first method assigns an outlier score based on the percolation (i.e., the fragmentation) of the graph. The second method uses the popular IsoMap non-linear dimensionality reduction algorithm, and assigns an outlier score by comparing the geodesic distances with the distances in the reduced space. We test these algorithms on real and synthetic datasets and show that they either outperform or perform on par with other popular outlier detection methods. A main advantage of the percolation method is that it is parameter-free and therefore does not require any training; on the other hand, the IsoMap method has two integer parameters, and when they are appropriately selected, the method performs similarly to or better than all the other methods tested. Peer reviewed. Postprint (published version).
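The percolation idea in the abstract above can be sketched roughly as follows. This is only one plausible reading, not the authors' algorithm: edges of the complete distance graph are removed from longest to shortest, and each node is scored by the length of the edge whose removal first separates it from the largest connected component. The function names `components` and `percolation_scores`, the use of Euclidean distance, and the tie-breaking rule are all assumptions made for illustration.

```python
import math
from itertools import combinations


def components(n, edges):
    """Return the connected components of an undirected graph on nodes 0..n-1."""
    adj = {i: [] for i in range(n)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        comps.append(comp)
    return comps


def percolation_scores(points):
    """Hypothetical percolation-style outlier score (assumed, not from the paper).

    Edges are removed from longest to shortest. A node's score is the length
    of the edge whose removal first detaches it from the largest component.
    A node that never leaves the largest component keeps a score of 0.0.
    """
    n = len(points)
    weighted = sorted(
        ((math.dist(points[i], points[j]), i, j)
         for i, j in combinations(range(n), 2)),
        reverse=True,
    )
    scores = [0.0] * n
    scored = set()
    for k, (w, _, _) in enumerate(weighted):
        # Graph that remains after removing the k + 1 longest edges.
        remaining = [(i, j) for _, i, j in weighted[k + 1:]]
        giant = set(max(components(n, remaining), key=len))
        for node in range(n):
            if node not in giant and node not in scored:
                scores[node] = w
                scored.add(node)
    return scores
```

Under this reading no parameter has to be tuned, which matches the parameter-free property claimed in the abstract; the brute-force recomputation of components at every step is only suitable for small datasets.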

    Network Topology Mapping from Partial Virtual Coordinates and Graph Geodesics

    For many important network types (e.g., sensor networks in complex harsh environments and social networks), physical coordinate systems (e.g., Cartesian) and physical distances (e.g., Euclidean) are either difficult to discern or inapplicable. Accordingly, coordinate systems and characterizations based on hop-distance measurements, such as Topology Preserving Maps (TPMs) and Virtual-Coordinate (VC) systems, are attractive alternatives to Cartesian coordinates for many network algorithms. Herein, we present an approach to recover geometric and topological properties of a network with a small set of distance measurements. In particular, our approach is a combination of shortest path (often called geodesic) recovery concepts and low-rank matrix completion, generalized to the case of hop-distances in graphs. Results for sensor networks embedded in 2-D and 3-D spaces, as well as social networks, indicate that the method can accurately capture the network connectivity with a small set of measurements. TPM generation can now also be based on various context-appropriate measurements or VC systems, as long as they characterize different nodes by distances to small sets of random nodes (instead of a set of global anchors). The proposed method is a significant generalization that allows the topology to be extracted from a random set of graph shortest paths, making it applicable in contexts such as social networks where VC generation may not be possible. Comment: 17 pages, 9 figures. arXiv admin note: substantial text overlap with arXiv:1712.1006
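To make the measurement setting in the abstract above concrete, the sketch below computes hop distances by breadth-first search and records, for each node, its distance to a small random set of other nodes. It covers only the sampling step; the low-rank matrix completion that the abstract combines with it is not shown. The names `hop_distances` and `partial_hop_matrix` and the parameter `samples_per_node` are hypothetical.

```python
import random
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop count from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def partial_hop_matrix(adj, samples_per_node, seed=0):
    """Observe hop distances from each node to a few randomly chosen nodes.

    Returns a dict mapping (u, v) to the hop distance, or None if v is
    unreachable from u. All other entries of the distance matrix stay
    unobserved, as they would be before any completion step.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    observed = {}
    for u in nodes:
        dist = hop_distances(adj, u)
        others = [v for v in nodes if v != u]
        for v in rng.sample(others, samples_per_node):
            observed[(u, v)] = dist.get(v)
    return observed
```

For a graph given as an adjacency dict such as `{0: [1], 1: [0, 2], 2: [1]}`, calling `partial_hop_matrix(adj, 1)` yields one observed hop distance per node.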

    Localization in wireless sensor networks with gradient descent

    In this article, we present two distance-based sensor network localization algorithms. The locations of the sensors are initially unknown, and we can estimate the relative locations of sensors by using knowledge of inter-sensor distance measurements. Together with the knowledge of the absolute locations of three or more sensors, we can also determine the locations of all the sensors in the wireless network. The proposed algorithms make use of gradient descent to achieve excellent localization accuracy. The two gradient descent algorithms are iterative in nature, and the result is obtained when the accuracy no longer improves. Simulation results have shown that the proposed algorithms have better performance than existing localization algorithms. A comparison of different methods is given in the paper. © 2011 IEEE. The 2011 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PacRim), Victoria, B.C., 23-26 August 2011. In IEEE PacRim Conference Proceedings, 2011, p. 91-9
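A minimal sketch of distance-based localization by gradient descent, in the spirit of the abstract above, is given below. It is not either of the paper's two algorithms: it minimizes the sum of squared differences between estimated and measured inter-node distances in 2-D, keeps anchor positions fixed, and runs for a fixed number of iterations instead of stopping when accuracy no longer improves. The function name `localize`, the cost function, and the step size are assumptions.

```python
import math


def localize(anchors, unknown_init, measurements, lr=0.05, iters=1000):
    """Estimate 2-D positions of unknown nodes by plain gradient descent.

    anchors:       dict node -> (x, y), known and kept fixed
    unknown_init:  dict node -> (x, y), initial guesses to be refined
    measurements:  list of (i, j, d), the measured distance d between i and j
    Cost (assumed): sum over measurements of (||p_i - p_j|| - d)^2.
    """
    pos = {k: list(v) for k, v in {**anchors, **unknown_init}.items()}
    for _ in range(iters):
        grad = {k: [0.0, 0.0] for k in unknown_init}
        for i, j, d in measurements:
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            dist = math.hypot(dx, dy) or 1e-12  # avoid division by zero
            coef = 2.0 * (dist - d) / dist
            if i in grad:
                grad[i][0] += coef * dx
                grad[i][1] += coef * dy
            if j in grad:
                grad[j][0] -= coef * dx
                grad[j][1] -= coef * dy
        # Only the unknown nodes move; anchors stay where they are.
        for k, g in grad.items():
            pos[k][0] -= lr * g[0]
            pos[k][1] -= lr * g[1]
    return {k: tuple(pos[k]) for k in unknown_init}
```

With three non-collinear anchors and noise-free distances, a single unknown node started near the middle of the anchors should converge to its true position; with noisy measurements the result is a least-squares compromise, and poor initial guesses can end in a local minimum.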

    An iteratively Reweighted Least Square algorithm for RSS-based sensor network localization

    In this article, we give a new algorithm for localization based on RSS measurement. There are many measurement methods for localizing the unknown nodes in a sensor network. RSS is the most popular one due to its simple and cheap hardware requirements. However, an accurate RSS-based algorithm is needed to obtain the positions of unknown nodes. Recent algorithms such as MDS (Multi-Dimensional Scaling)-MAP and PDM (Proximity Distance Matrix) cannot give accurate results based on RSS, as the RSS signals always have large variations. Besides, recent algorithms for sensor network localization ignore the received signal strength (RSS) itself and thus achieve disappointing accuracy. This is because they mostly focus on the difference between the estimated distance and the real distance. This paper introduces a target function, signal-based maximum likelihood (SML), which uses the maximum likelihood based on the directly measured RSS signal. Inspired by the SMACOF (Scaling by Majorizing A COmplicated Function) algorithm, an iterative surrogate algorithm named IRLS (Iteratively Reweighted Least Square) is introduced to solve the SML. The simulation results show that the IRLS algorithm can give accurate results for RSS positioning. When compared with other popular algorithms such as MDS-MAP, PDM, and SMACOF, the error (distance between the estimated position and the actual position) produced by IRLS is lower than that of all the other algorithms. In anisotropic networks, IRLS also performs well. © 2011 IEEE. The 2011 IEEE International Conference on Mechatronics and Automation (ICMA 2011), Beijing, China, 7-10 August 2011. In Proceedings of ICMA, 2011, p. 1085-109

    Supervising Embedding Algorithms Using the Stress

    While classical scaling, just like principal component analysis, is parameter-free, most other methods for embedding multivariate data require the selection of one or several parameters. This tuning can be difficult due to the unsupervised nature of the situation. We propose a simple, almost obvious, approach to supervise the choice of tuning parameter(s): minimize a notion of stress. We substantiate this choice by reference to rigidity theory. We extend a result by Aspnes et al. (IEEE Mobile Computing, 2006), showing that general random geometric graphs are trilateration graphs with high probability. And we provide a stability result à la Anderson et al. (SIAM Discrete Mathematics, 2010). We illustrate this approach in the context of the MDS-MAP(P) algorithm of Shang and Ruml (IEEE INFOCOM, 2004). As a prototypical patch-stitching method, it requires the choice of patch size, and we use the stress to make that choice data-driven. In this context, we perform a number of experiments to illustrate the validity of using the stress as the basis for tuning parameter selection. In so doing, we uncover a bias-variance tradeoff, a phenomenon that may have been overlooked in the multidimensional scaling literature. By turning MDS-MAP(P) into a method for manifold learning, we obtain a local version of Isomap for which the minimization of the stress may also be used for parameter tuning.
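The selection rule described in the abstract above (pick the tuning parameter whose embedding has the smallest stress) can be sketched as follows. The abstract does not say which notion of stress is used, so raw stress, the sum of squared differences between embedded distances and given dissimilarities, is assumed here. The names `raw_stress` and `select_by_stress` are hypothetical, and the candidate embeddings are taken as given instead of being produced by MDS-MAP(P).

```python
import math
from itertools import combinations


def raw_stress(embedding, dissim):
    """Raw stress (assumed notion): squared mismatch between embedded
    distances and the given dissimilarities, summed over all pairs."""
    n = len(embedding)
    return sum(
        (math.dist(embedding[i], embedding[j]) - dissim[i][j]) ** 2
        for i, j in combinations(range(n), 2)
    )


def select_by_stress(candidates, dissim):
    """Return the key (e.g. a patch size) whose embedding has minimal stress.

    candidates: dict mapping a parameter value to the embedding it produced
    dissim:     full symmetric matrix of pairwise dissimilarities
    """
    return min(candidates, key=lambda name: raw_stress(candidates[name], dissim))
```

In practice each entry of `candidates` would come from running the embedding method once per parameter value, so the selection costs one embedding per candidate plus a pass over the pairs.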

    HEA-Loc: A robust localization algorithm for sensor networks of diversified topologies

    In recent years, localization in a variety of Wireless Sensor Networks (WSNs) has been a compelling but elusive goal. Several algorithms that use different methodologies have been proposed to achieve this goal. The performance of these algorithms depends on several factors, such as the sensor node placement, anchor deployment, or network topology. In this paper, we propose a robust localization algorithm called Hybrid Efficient and Accurate Localization (HEA-Loc). HEA-Loc combines two techniques, Extended Kalman Filter (EKF) and Proximity-Distance Map (PDM), to improve localization accuracy. It is distributed in nature and works well in various scenarios, as it is less susceptible to anchor deployment and the network topology. Furthermore, HEA-Loc is highly robust and can work well even when the measurement errors are large. Simulation results show that HEA-Loc outperforms existing algorithms in both computational complexity and communication overhead. © 2010 IEEE. The IEEE Wireless Communications and Networking Conference (WCNC 2010), Sydney, NSW., 18-21 April 2010. In Proceedings of WCNC, 2010, p. 1-

    Doctor of Philosophy

    Dissertation. In wireless sensor networks, knowing the location of the wireless sensors is critical in many remote sensing and location-based applications, from asset tracking and structural monitoring to geographical routing. For a majority of these applications, received signal strength (RSS)-based localization algorithms are a cost-effective and viable solution. However, RSS measurements vary unpredictably because of fading, the shadowing caused by the presence of walls and obstacles in the path, and non-isotropic antenna gain patterns, which affect the performance of the RSS-based localization algorithms. This dissertation aims to provide efficient models for the measured RSS and use the lessons learned from these models to develop and evaluate efficient localization algorithms. The first contribution of this dissertation is to model the correlation in shadowing across link pairs. We propose a non-site-specific statistical joint path loss model between a set of static nodes. Radio links that are geographically proximate often experience similar environmental shadowing effects and thus have correlated shadowing. Using a large number of multi-hop network measurements in an ensemble of indoor and outdoor environments, we show statistically significant correlations among shadowing experienced on different links in the network. Finally, we analyze multihop paths in three- and four-node networks using both correlated and independent shadowing models and show that independent shadowing models can underestimate the probability of route failure by a factor of two or greater. Second, we study a special class of algorithms, called kernel-based localization algorithms, that use kernel methods as a tool for learning correlation between the RSS measurements. Kernel methods simplify RSS-based localization algorithms by providing a means to learn the complicated relationship between RSS measurements and position.
We present a common mathematical framework for kernel-based localization algorithms to study and compare the performance of four different kernel-based localization algorithms from the literature. We show via simulations and an extensive measurement data set that kernel-based localization algorithms can perform better than model-based algorithms. Results show that kernel methods can achieve an RMSE up to 55% lower than a model-based algorithm. Finally, we propose a novel distance estimator for estimating the distance between two nodes a and b using indirect link measurements, which are the measurements made between a and k, for k ≠ b, and between b and k, for k ≠ a. Traditionally, distance estimators use only the direct link measurement, which is the pairwise measurement between the nodes a and b. The results show that the estimator that uses indirect link measurements enables better distance estimation than the estimator that uses direct link measurements.