346 research outputs found

    Abnormal behavior detection using tensor factorization

    Real-Time Location Systems (RTLS) based on RFID are a popular surveillance method for security. However, in an open and dynamic environment, where patterns rarely repeat, it is difficult to implement a model that can analyze all the information generated in real time and detect abnormal events. In this thesis, we present a new approach to analyzing the spatio-temporal information generated by RTLS. In this approach, discrete events capturing the location of a given person (or object), at a specific time, on a certain day are generated and stored at fixed intervals. Using a latent semantic analysis (LSA) technique based on tensor factorization, we extract and leverage latent information contained in real-time streams of multidimensional RFID data represented as these discrete events. One of the main contributions of this work is a parametric Log-Linear Tensor Factorization (LLTF) model which learns the joint probability of contextual elements, in which the parameters are the factors of the event tensor. Using an efficient method based on Nesterov’s accelerated gradient, we obtain a trained model, which is a set of latent factors representing a summarized version of the multidimensional data corresponding to high-level descriptions of each dimension. We evaluated this LLTF-based approach through a series of experiments on synthetic as well as real-life datasets. In a comparative analysis, our proposed approach outperformed some state-of-the-art methods for factor analysis via tensor factorization and abnormal behaviour detection.
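    To make the idea of a log-linear model over an event tensor more concrete, here is a minimal sketch. The abstract does not give the LLTF parameterization, so the CP-style score used here (a sum over rank components of products of factor entries), the function name `log_joint`, and the toy factor values are all assumptions for illustration. Training with Nesterov's accelerated gradient is omitted; the factors are simply fixed by hand.

```python
import math
from itertools import product

def log_joint(factors, event):
    """Log-probability of one event under an ASSUMED log-linear CP-style model.

    factors: one matrix per mode (e.g. person, zone, hour), each a list of
             rows with `rank` entries.
    event:   one index per mode, e.g. (person, zone, hour).
    """
    rank = len(factors[0][0])

    def score(idx):
        # Sum over rank components of the product of one factor entry per mode.
        return sum(
            math.prod(factors[m][idx[m]][r] for m in range(len(factors)))
            for r in range(rank)
        )

    # Normalize over every cell of the (small) tensor so probabilities sum to 1.
    cells = product(*(range(len(f)) for f in factors))
    log_z = math.log(sum(math.exp(score(idx)) for idx in cells))
    return score(event) - log_z

# Hypothetical rank-1 factors for 2 persons, 2 zones and 2 time slots.
TOY_FACTORS = [
    [[1.0], [-1.0]],  # persons
    [[1.0], [-1.0]],  # zones
    [[1.0], [0.5]],   # time slots
]

# A lower log-probability marks an event as less expected under the model.
usual = log_joint(TOY_FACTORS, (0, 0, 0))
unusual = log_joint(TOY_FACTORS, (0, 1, 0))
```

    Under this assumed form, an event could be flagged as abnormal when its negative log-probability exceeds a chosen threshold. The brute-force normalization is only practical for tiny tensors like this toy one.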

    Data Monitoring and Analysis in Wireless Networks

    Various wireless network technologies have been created to meet the ever-increasing demand for wireless access to the Internet, such as wireless local area networks, cellular networks, sensor networks, and many more. Communication devices have transformed from large computational servers to small wireless hand-held devices, ranging from laptops, tablets, and smartphones to small sensors. The advances in these wireless networks (e.g., faster network speeds) and their intensive use have resulted in an enormous growth of network data in terms of volume, diversity, and complexity. All of these changes have raised complicated issues of network measurement and management. In the first part of this thesis, I study how WiFi network characteristics impact network forensics investigation and home security monitoring. I first focus on network forensics investigation and propose a wireless forensic monitoring system to collect trace digests of WiFi activities and facilitate cybercrime investigation. Then, I design and develop a low-cost home security system based on WiFi networks for physical intruder detection. Two methods, MAC-based detection and RSSI-variance-based detection, are proposed based on the characteristics of WiFi networks. In the second part, I study how to effectively and efficiently model multiple coevolving time series, which are ubiquitous in network measurement, especially in wireless sensor networks. Two comprehensive algorithms are proposed to address three prominent challenges of mining coevolving sensor-measured traces: (a) high order; (b) contextual constraints; and (c) temporal smoothness.
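    As a rough illustration of what RSSI-variance-based detection could look like, the sketch below flags sliding windows of signal-strength readings whose variance exceeds a threshold. The abstract does not describe the thesis's actual procedure, so the function name `rssi_variance_alerts`, the window size, and the threshold value are hypothetical choices made only for this example.

```python
from statistics import pvariance

def rssi_variance_alerts(rssi_dbm, window=5, threshold=4.0):
    """Return start indices of windows whose RSSI variance exceeds the threshold.

    rssi_dbm:  sequence of RSSI readings in dBm from one WiFi link.
    window:    number of consecutive readings per window (assumed value).
    threshold: variance in dB^2 above which a window is flagged (assumed value).
    """
    alerts = []
    for start in range(len(rssi_dbm) - window + 1):
        # Population variance of the readings in the current window.
        if pvariance(rssi_dbm[start:start + window]) > threshold:
            alerts.append(start)
    return alerts

# A steady link produces no alerts; a sudden drop in signal strength does.
quiet = rssi_variance_alerts([-50, -51, -50, -50, -51, -50])
disturbed = rssi_variance_alerts([-50, -50, -50, -50, -70])
```

    The intuition assumed here is that a person moving through the signal path perturbs the received signal strength, so a jump in variance may indicate physical presence. A real deployment would need the threshold calibrated per environment.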

    A Study on Large-Scale Graph Data Analysis Using Eigenvalue Decomposition and Tensor Decomposition

    筑波大学 (University of Tsukuba)201

    No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling

    Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. Depending on the machine learning algorithm, the initialization can lead to variability in the results. Furthermore, distortions of cluster geometry can be misleading. Among the causes, the presence of outliers and anomalies can be a determining factor. Despite the relevance of initialization and outlier issues for text clustering and topic modeling, the authors did not find an in-depth analysis of them. This survey provides a systematic literature review (2011-2022) of these subareas and proposes a common terminology, since similar procedures go by different terms. The authors describe research opportunities, trends, and open issues. The appendices summarize the theoretical background of the text vectorization, factorization, and clustering algorithms that are directly or indirectly related to the reviewed works.