
    Machine Learning for Identifying Group Trajectory Outliers

    Prior work on the trajectory outlier detection problem considers only individual outliers. However, in real-world scenarios trajectory outliers often appear in groups, e.g., a group of bikes that deviates from the usual trajectory because of street maintenance in an intelligent transportation setting. This paper considers the Group Trajectory Outlier (GTO) problem and proposes three algorithms. The first and second algorithms are extensions of the well-known DBSCAN and kNN algorithms, while the third models the GTO problem as a feature selection problem. Furthermore, two enhancements of the proposed algorithms are introduced. The first is based on ensemble learning and computational intelligence and merges the algorithms' outputs to potentially improve the final result. The second is a general high-performance computing framework for big trajectory databases, which we used for a GPU-based implementation. Experimental results on several real trajectory databases show the scalability of the proposed approaches.
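    To make the DBSCAN-based idea concrete, the hypothetical sketch below (our own simplification, not the authors' code) clusters individual trajectory outliers, assumed to be given as feature vectors, into micro-clusters and keeps only the sufficiently dense ones as group trajectory outliers; the feature representation and the `eps`, `min_samples`, and `density_thresh` parameters are all assumptions.

```python
# Hypothetical sketch of a DBSCAN-style group trajectory outlier detector.
import numpy as np
from sklearn.cluster import DBSCAN

def group_trajectory_outliers(outlier_feats, eps=0.5, min_samples=3, density_thresh=1.0):
    """outlier_feats: (n_outliers, d) feature vectors of individual trajectory outliers."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(outlier_feats)
    groups = []
    for lbl in set(labels) - {-1}:                      # -1 marks DBSCAN noise
        members = outlier_feats[labels == lbl]
        # crude density proxy: cluster size per unit of average pairwise distance
        dists = np.linalg.norm(members[:, None] - members[None, :], axis=-1)
        density = len(members) / (dists.mean() + 1e-9)
        if density >= density_thresh:                   # pruning step (assumed threshold)
            groups.append(np.where(labels == lbl)[0])   # indices forming one group outlier
    return groups
```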

    Designing a streaming algorithm for outlier detection in data mining—an incremental approach

    Designing an algorithm for detecting outliers over streaming data has become an important task in many common applications, arising in areas such as fraud detection, network analysis, and environment monitoring. Because real-time data may arrive in the form of streams rather than batches, properties such as concept drift, temporal context, transiency, and uncertainty need to be considered. In addition, data processing needs to be incremental, scalable, and able to run with limited memory. These facts create big challenges for existing outlier detection algorithms in terms of their accuracy when they are implemented incrementally, especially in a streaming environment. To address these problems, we first propose C_KDE_WR, which uses a sliding window and a kernel function to process streaming data online; implemented in a CUDA framework on a Graphics Processing Unit (GPU), it demonstrates high throughput on real-time streaming data. We also present a second algorithm, C_LOF, based on the popular and effective Local Outlier Factor (LOF) outlier detection algorithm, which unfortunately works only on batched data. Using a novel incremental approach that compensates for the high complexity of LOF, we show how to implement it in a streaming context and obtain results in a timely manner. Like C_KDE_WR, C_LOF employs a sliding window and statistical summaries to make decisions based on the data in the current window, and it addresses the same streaming-data challenges as C_KDE_WR. In addition, we report a comparative evaluation of the accuracy of C_KDE_WR against the state-of-the-art SOD_GPU using Precision, Recall, and F-score, and a t-test is performed to demonstrate the significance of the improvement. We further report the results of C_LOF under different parameter settings and plot ROC and PR curves with their area under the curve (AUC) and Average Precision (AP) values, respectively. Experimental results show that C_LOF can overcome the masquerading problem, which often arises in outlier detection on streaming data. We provide a complexity analysis and report experimental results on the accuracy of both C_KDE_WR and C_LOF in order to evaluate their effectiveness as well as their efficiency.
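    As an illustration of the sliding-window kernel-density idea behind C_KDE_WR (not the authors' CUDA implementation), the hypothetical sketch below keeps a fixed-size window of recent points and flags a new point whose unnormalized Gaussian-kernel density over the window falls below a threshold; `window_size`, `bandwidth`, and `density_thresh` are assumed parameters.

```python
# Hypothetical sliding-window kernel-density outlier detector for a data stream.
from collections import deque
import numpy as np

class WindowKDEDetector:
    def __init__(self, window_size=200, bandwidth=0.5, density_thresh=1e-3):
        self.window = deque(maxlen=window_size)   # sliding window of recent points
        self.h = bandwidth
        self.thresh = density_thresh

    def score(self, x):
        """Unnormalized average Gaussian-kernel density of x over the window."""
        if not self.window:
            return float("inf")
        pts = np.asarray(self.window)
        d2 = np.sum((pts - x) ** 2, axis=1)
        return float(np.mean(np.exp(-d2 / (2 * self.h ** 2))))

    def update(self, x):
        """Return True if x looks like an outlier, then add it to the window."""
        x = np.asarray(x, dtype=float)
        is_outlier = self.score(x) < self.thresh
        self.window.append(x)                      # oldest point drops out automatically
        return is_outlier
```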

    Neuromorphic Learning Systems for Supervised and Unsupervised Applications

    Advancements in high-performance computing (HPC) have enabled the large-scale implementation of neuromorphic learning models and pushed research on computational intelligence into a new era. These bio-inspired models are constructed on top of unified building blocks, i.e., neurons, and have revealed potential for learning complex information. Two major challenges remain in neuromorphic computing. First, sophisticated structuring methods are needed to determine the connectivity of the neurons in order to model various problems accurately. Second, the models need to adapt to non-traditional architectures for improved computation speed and energy efficiency. In this thesis, we address these two problems and apply our techniques to different cognitive applications.

    This thesis first presents the self-structured confabulation network for anomaly detection. Among machine learning applications, unsupervised detection of anomalous streams is especially challenging because it requires both detection accuracy and real-time performance. Designing a computing framework that harnesses the growing computing power of multicore systems while maintaining high sensitivity and specificity to the anomalies is an urgent research need. We present AnRAD (Anomaly Recognition And Detection), a bio-inspired detection framework that performs probabilistic inferences. We leverage the mutual information between the features and develop a self-structuring procedure that learns a succinct confabulation network from the unlabeled data. This network is capable of fast incremental learning, which continuously refines the knowledge base from the data streams. Compared to several existing anomaly detection methods, the proposed approach provides competitive detection accuracy as well as insight into the reasoning behind its decisions.

    Furthermore, we exploit the massively parallel structure of the AnRAD framework. Our implementations of the recall algorithms on the graphics processing unit (GPU) and the Xeon Phi co-processor both obtain substantial speedups over the sequential implementation on a general-purpose processor (GPP). The implementation enables real-time service to concurrent data streams with diversified contexts and can be applied to large problems with multiple local patterns. Experimental results demonstrate high computing performance and memory efficiency. For vehicle abnormal behavior detection, the framework is able to monitor up to 16,000 vehicles and their interactions in real time with a single commodity co-processor, using less than 0.2 ms per testing subject.

    When adapting our streaming anomaly detection model to mobile devices or unmanned systems, the key challenge is to deliver the required performance under stringent power constraints. To address the conflict between performance and power consumption, brain-inspired hardware, such as the IBM Neurosynaptic System, has been developed to enable low-power implementation of neural models. As a follow-up to the AnRAD framework, we propose to port the detection network to the TrueNorth architecture. Implementing inference-based anomaly detection on a neurosynaptic processor is not straightforward due to hardware limitations. A design flow and a supporting component library are developed to flexibly map the learned detection networks to the neurosynaptic cores. Instead of the popular rate code, a burst code is adopted in the design, which represents a numerical value using the phase of a burst of spike trains. This not only reduces the hardware complexity but also increases the accuracy of the results. A Corelet library, NeoInfer-TN, is implemented for basic operations in burst code, and two-phase pipelines are constructed from the library components. The design can be configured for different tradeoffs between detection accuracy, hardware resource consumption, throughput, and energy. We evaluate the system using network intrusion detection data streams. The results show a higher detection rate than several conventional approaches and real-time performance, with only 50 mW power consumption; overall, it achieves 10^8 operations per Joule.

    In addition to the modeling and implementation of unsupervised anomaly detection, we also investigate a supervised learning model based on neural networks and deep fragment embedding and apply it to text-image retrieval. The study aims at bridging the gap between images and natural language and at improving bidirectional retrieval performance across the modalities. Unlike existing works that target a single sentence densely describing the image objects, we elevate the topic to associating deep image representations with noisy texts that are only loosely correlated. Based on text-image fragment embedding, our model employs a sequential configuration that connects two embedding stages: the first stage learns the relevancy of the text fragments, and the second stage uses the filtered output from the first to improve the matching results. The model also integrates multiple convolutional neural networks (CNNs) to construct the image fragments, from which rich context information such as human faces can be extracted to increase the alignment accuracy. The proposed method is evaluated with both a synthetic dataset and a real-world dataset collected from a picture news website. The results show up to a 50% ranking performance improvement over the comparison models.
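    As a toy illustration of the burst-code idea mentioned above, in which a value is carried by the phase of a spike burst rather than by a firing rate, the hypothetical Python sketch below encodes a scalar in [0, 1] as the start position of a short burst inside a fixed encoding window; the window length, burst length, and helper names are assumptions, not the NeoInfer-TN Corelet implementation.

```python
# Hypothetical burst-code encoder/decoder: value <-> phase of a spike burst.
import numpy as np

def burst_encode(value, window_ticks=16, burst_len=4):
    """Return a binary spike train whose burst start position encodes `value` in [0, 1]."""
    start = int(round(value * (window_ticks - burst_len)))   # phase derived from value
    train = np.zeros(window_ticks, dtype=np.uint8)
    train[start:start + burst_len] = 1                       # contiguous burst of spikes
    return train

def burst_decode(train, burst_len=4):
    """Recover the value from the position of the first spike of the burst."""
    start = int(np.argmax(train))                            # index of the first spike
    return start / (len(train) - burst_len)

spikes = burst_encode(0.75)
assert abs(burst_decode(spikes) - 0.75) < 0.1
```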

    A high-performance IoT solution to reduce frost damages in stone fruits

    Agriculture is one of the key sectors where technology is opening new opportunities to disrupt the market. The Internet of Things (IoT) could reduce production costs and increase product quality by providing intelligence services via IoT analytics. However, the harsh weather conditions and the lack of connectivity in the field limit the successful deployment of such services, as they require both fully connected infrastructures and high computational resources. Edge computing has emerged as a solution to bring computing power into close proximity to the sensors, providing energy savings, highly responsive web services, and the ability to mask transient cloud outages. In this paper, we propose an IoT monitoring system that activates anti-frost techniques to avoid crop loss, defining two intelligent services to detect outliers caused by sensor errors. The former is a nearest neighbor technique and the latter is the k-means algorithm, which provides better quality results but increases the computational cost. Cloud and edge computing approaches are analyzed by targeting two different low-power GPUs. Our experimental results show that cloud-based approaches provide the highest performance in general, but edge computing is a compelling alternative for masking transient cloud outages and providing highly responsive data analytics services in technologically hostile environments.

    This work was partially supported by the Fundación Séneca del Centro de Coordinación de la Investigación de la Región de Murcia under Project 20813/PI/18, and by the Spanish Ministry of Science, Innovation and Universities under grants TIN2016-78799-P (AEI/FEDER, UE) and RTC-2017-6389-5. Finally, we thank the farmers for making their resources available to assess and improve the proposed IoT monitoring system.

    Guillén-Navarro, M. A.; Martínez-España, R.; López, B.; Cecilia-Canales, J. M. (2021). A high-performance IoT solution to reduce frost damages in stone fruits. Concurrency and Computation: Practice and Experience (Online), 33(2), 1-14. https://doi.org/10.1002/cpe.5299
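    As a rough illustration of the second service described in the abstract above (the k-means-based detector), the hypothetical sketch below clusters recent sensor readings and flags a new reading as a likely sensor error when it lies far from every centroid; it is not the paper's implementation, and the cluster count `k` and distance threshold are assumptions.

```python
# Hypothetical k-means-based sensor-error detector for IoT readings.
import numpy as np
from sklearn.cluster import KMeans

def fit_centroids(readings, k=3):
    """readings: (n, d) array of recent sensor samples (e.g., temperature, humidity)."""
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(readings).cluster_centers_

def is_sensor_outlier(x, centroids, thresh=3.0):
    """True when reading x is farther than `thresh` from its nearest centroid."""
    d = np.linalg.norm(centroids - np.asarray(x, dtype=float), axis=1)
    return float(d.min()) > thresh
```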

    The Lannion report on Big Data and Security Monitoring Research

    During the last decade, big data management has attracted increasing interest from both the industrial and academic communities. In parallel, cyber security has become mandatory due to more varied and intensive threats. In June 2022, a group of researchers met to reflect on their community's impact on current research challenges. In particular, they considered four dimensions: (1) dedicated systems, namely data processing and analytics platforms or time series management systems; (2) graph analytics and distributed computation; (3) privacy; and (4) new hardware.

    PoseFusion2: Simultaneous Background Reconstruction and Human Shape Recovery in Real-time


    Design and Implementation of Hardware Accelerators Using Multi-Level Parallelization and Application-Oriented Data Layout (マルチレベル並列化とアプリケーション指向データレイアウトを用いるハードウェアアクセラレータの設計と実装)

    Degree type: Doctoral degree (course-based). Dissertation committee: (Chair) Prof. Masayuki Inaba (稲葉 雅幸), Prof. Reiji Suda (須田 礼仁), Prof. Takeo Igarashi (五十嵐 健夫), Prof. Kenji Yamanishi (山西 健司), Assoc. Prof. Mari Inaba (稲葉 真理), and Lecturer Hideki Nakayama (中山 英樹), all of The University of Tokyo. University of Tokyo (東京大学).

    Trajectory outlier detection: New problems and solutions for smart cities

    This article introduces two new problems related to trajectory outlier detection: (1) group trajectory outlier (GTO) detection and (2) deviation point detection for both individual trajectory outliers and groups of trajectory outliers. Five algorithms are proposed for the first problem by adapting DBSCAN, k nearest neighbors (kNN), and feature selection (FS). DBSCAN-GTO first applies DBSCAN to derive micro-clusters, which are considered potential candidates; a pruning strategy based on a density computation measure is then used to find the groups of trajectory outliers. kNN-GTO recursively derives the trajectory candidates from the individual trajectory outliers and prunes them based on their density, repeating the process for all individual trajectory outliers. FS-GTO considers the set of individual trajectory outliers as the set of all features, and the FS process is used to retrieve the groups of trajectory outliers. The proposed algorithms are improved by incorporating ensemble learning and high-performance computing during the detection process. Moreover, we propose a general two-phase algorithm for detecting deviation points, as well as a version for graphics processing unit (GPU) implementation using sliding windows. Experiments on a real trajectory dataset have been carried out to demonstrate the performance of the proposed approaches. The results show that they can efficiently identify useful patterns represented by groups of trajectory outliers and deviation points, and that they outperform baseline group detection algorithms.
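    To illustrate the sliding-window flavor of the deviation point detection discussed above (our own simplification, not the paper's two-phase or GPU algorithm), the hypothetical sketch below walks a window along an outlier trajectory and reports the first position whose distance to the closest normal trajectory exceeds a threshold; the window size and threshold are assumed parameters.

```python
# Hypothetical sliding-window deviation point detection for a trajectory outlier.
import numpy as np

def deviation_point(outlier_traj, normal_trajs, win=5, thresh=2.0):
    """outlier_traj: (n, 2) points; normal_trajs: list of (m, 2) arrays; returns index or None."""
    for i in range(len(outlier_traj) - win + 1):
        window = outlier_traj[i:i + win]
        # distance of the window to the closest normal trajectory (mean point-to-set distance)
        best = min(
            np.mean([np.linalg.norm(traj - p, axis=1).min() for p in window])
            for traj in normal_trajs
        )
        if best > thresh:
            return i          # first window that deviates marks the deviation point
    return None
```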