528 research outputs found

    Trajectory Clustering and an Application to Airspace Monitoring

    This paper presents a framework aimed at monitoring the behavior of aircraft in a given airspace. Nominal trajectories are determined and learned using data-driven methods. Standard procedures are used by air traffic controllers (ATC) to guide aircraft, ensure the safety of the airspace, and maximize runway occupancy. Even though standard procedures are used by ATC, the control of the aircraft remains with the pilots, leading to a large variability in the flight patterns observed. Two methods to identify typical operations and their variability from recorded radar tracks are presented. This knowledge base is then used to monitor the conformance of current operations against operations previously identified as standard. A tool called AirTrajectoryMiner is presented, aimed at monitoring the instantaneous health of the airspace in real time. The airspace is "healthy" when all aircraft are flying according to the nominal procedures. A measure of complexity is introduced, measuring the conformance of current flights to nominal flight patterns. When an aircraft does not conform, the complexity increases, as more attention from ATC is required to ensure a safe separation between aircraft. Comment: 15 pages, 20 figures
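The conformance check described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual complexity measure, so everything below is an assumption: tracks are taken to be 2-D and resampled to the same number of points, conformance is scored as the mean pointwise distance to the closest nominal trajectory, and the function names and threshold are hypothetical.

```python
import math

def mean_pointwise_distance(track, nominal):
    """Mean Euclidean distance between two equally sampled 2-D tracks."""
    if len(track) != len(nominal):
        raise ValueError("tracks must have the same number of samples")
    return sum(math.dist(p, q) for p, q in zip(track, nominal)) / len(track)

def nearest_nominal(track, nominals):
    """Return (index, distance) of the nominal trajectory closest to `track`."""
    distances = [mean_pointwise_distance(track, n) for n in nominals]
    best = min(range(len(nominals)), key=distances.__getitem__)
    return best, distances[best]

def count_nonconforming(tracks, nominals, threshold):
    """Count tracks whose distance to every nominal trajectory exceeds `threshold`.

    This count is only a stand-in for the paper's complexity measure.
    """
    return sum(1 for t in tracks if nearest_nominal(t, nominals)[1] > threshold)

# Illustrative data (not from the paper): two nominal routes and two observed tracks.
nominal_routes = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],
]
observed = [
    [(0.0, 0.1), (1.0, 0.1), (2.0, 0.1)],   # close to the first route
    [(5.0, 5.0), (6.0, 6.0), (7.0, 7.0)],   # far from both routes
]
print(count_nonconforming(observed, nominal_routes, threshold=1.0))
```

A count of this kind rises as more aircraft leave the learned patterns, which mirrors the abstract's idea that non-conforming flights demand more ATC attention.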

    Hybrid group anomaly detection for sequence data: application to trajectory data analytics

    Many research areas depend on group anomaly detection. The use of group anomaly detection can maintain and provide security and privacy to the data involved. This research addresses a gap in the existing outlier detection literature by proposing a novel hybrid framework for detecting group anomalies in sequence data. It proposes two approaches for efficiently solving this problem: i) a hybrid data mining-based algorithm, which consists of three main phases: first, a clustering algorithm is applied to derive the micro-clusters; second, the kNN algorithm is applied to each micro-cluster to compute the candidate group outliers; third, a pattern mining framework is applied to the candidate group outliers as a pruning strategy to generate the groups of outliers; and ii) a GPU-based approach, which exploits massive GPU computing to reduce the runtime of the hybrid data mining-based algorithm. Extensive experiments on different sequence databases were conducted to show the advantages of our proposed model. Results clearly show the efficiency of the GPU approach when directly compared to a sequential approach, reaching a speedup of 451. In addition, both approaches outperform the baseline methods for group detection.
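The second phase above (kNN applied inside each micro-cluster) can be sketched as follows. This is a simplified illustration, not the paper's algorithm: it assumes each item has already been embedded as a numeric vector, scores a point by its mean distance to its k nearest neighbours within the same micro-cluster, and uses hypothetical function names. The clustering and pattern-mining phases are not shown.

```python
import math

def knn_score(point, others, k):
    """Mean distance from `point` to its k nearest neighbours in `others`."""
    nearest = sorted(math.dist(point, o) for o in others)[:k]
    return sum(nearest) / len(nearest)

def candidate_outliers(micro_cluster, k=2, top_n=1):
    """Return indices of the `top_n` points with the largest kNN score.

    Assumes the micro-cluster holds at least two points.
    """
    scored = []
    for i, p in enumerate(micro_cluster):
        others = micro_cluster[:i] + micro_cluster[i + 1:]
        scored.append((knn_score(p, others, k), i))
    scored.sort(reverse=True)  # highest score (most isolated) first
    return [i for _, i in scored[:top_n]]

# Illustrative micro-cluster: four tightly grouped points and one distant point.
cluster = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
print(candidate_outliers(cluster, k=2, top_n=1))
```

Scoring each micro-cluster separately keeps the neighbour search small, which is presumably part of why the per-cluster work parallelises well on a GPU.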

    Evaluation of mineralogy per geological layers by Approximate Bayesian Computation

    We propose a new methodology to perform mineralogical inversion from wellbore logs based on a Bayesian linear regression model. Our method essentially relies on three steps. The first step makes use of Approximate Bayesian Computation (ABC) and selects from the Bayesian generator a set of candidate volumes corresponding closely to the wellbore data responses. The second step gathers these candidates through a density-based clustering algorithm. A mineral scenario is then assigned to each cluster through direct mineralogical inversion, and we provide a confidence estimate for each lithological hypothesis. The advantage of this approach is that it explores all possible mineralogy hypotheses that match the wellbore data. This pipeline is tested on both synthetic and real datasets.
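The first step (ABC selection of candidates) can be illustrated with a plain rejection sampler. This is a toy sketch under stated assumptions, not the paper's pipeline: the forward model is a noise-free two-mineral linear mixing law for a density log, the prior is uniform, and the function names, tolerance, and observed value are hypothetical. The grain densities used for quartz and calcite (2.65 and 2.71 g/cm³) are standard textbook values.

```python
import random

def abc_rejection(observed, simulate, sample_prior, tolerance, n_draws, seed=0):
    """Keep prior draws whose simulated response lies within `tolerance` of `observed`."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        if abs(simulate(theta) - observed) <= tolerance:
            accepted.append(theta)
    return accepted

def toy_density_log(quartz_fraction, rho_quartz=2.65, rho_calcite=2.71):
    """Toy linear forward model: bulk density of a quartz/calcite mixture (g/cm^3)."""
    return quartz_fraction * rho_quartz + (1.0 - quartz_fraction) * rho_calcite

# Candidate quartz fractions compatible with a hypothetical log reading of 2.68 g/cm^3.
candidates = abc_rejection(
    observed=2.68,
    simulate=toy_density_log,
    sample_prior=lambda rng: rng.random(),  # uniform prior on [0, 1)
    tolerance=0.005,
    n_draws=5000,
)
print(len(candidates), min(candidates), max(candidates))
```

In the method described above, the accepted candidates would then be grouped by a density-based clustering algorithm, so that each cluster can be read as one mineralogical hypothesis.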

    Novelty Detection And Cluster Analysis In Time Series Data Using Variational Autoencoder Feature Maps

    The identification of atypical events and anomalies in complex data systems is an essential yet challenging task. The dynamic nature of these systems produces huge volumes of data that are often heterogeneous, and the failure to account for this will impede the detection of anomalies. Time series data encompass these issues, and their high-dimensional nature intensifies these challenges. This research presents a framework for the identification of anomalies in temporal data. A comparative analysis of centroid-, density- and neural network-based clustering techniques was performed and their scalability was assessed. This facilitated the development of a new algorithm called the Variational Autoencoder Feature Map (VAEFM), an ensemble method based on Kohonen’s Self-Organizing Maps (SOM) and Variational Autoencoders. The VAEFM is an unsupervised learning algorithm that models the distribution of temporal data without making a priori assumptions. It incorporates principles of novelty detection to enhance the representational capacity of SOM neurons, which improves their ability to generalize to novel data. The VAEFM technique was demonstrated on a dataset of accumulated aircraft sensor recordings, to detect atypical events that transpired in the approach phase of flight. This is a proactive means of accident prevention and is therefore advantageous to the aviation industry. Furthermore, accumulated aircraft data present big data challenges, which require scalable analytical solutions. The results indicated that VAEFM successfully identified temporal dependencies in the flight data and produced several clusters and outliers. It analyzed over 2500 flights in under 5 minutes and identified 12 clusters, two of which contained stabilized approaches. The remainder comprised aborted approaches, excessively high/fast descent patterns and other contributory factors for unstabilized approaches. 
Outliers were detected which revealed oscillations in aircraft trajectories, some of which would have a lower detection rate using traditional flight safety analytical techniques. The results further indicated that VAEFM facilitates large-scale analysis, and its scaling efficiency was demonstrated on a High Performance Computing System by using an increased number of processors, where it achieved an average speedup of 70%.

    Featured Anomaly Detection Methods and Applications

    Anomaly detection is a fundamental research topic that has been widely investigated. From critical industrial systems, e.g., network intrusion detection systems, to people’s daily activities, e.g., mobile fraud detection, anomaly detection has become a vital first resort for protecting and securing public and personal property. Although anomaly detection methods have been under consistent development over the years, the explosive growth of data volume and the continued dramatic variation of data patterns pose great challenges to anomaly detection systems and are fuelling a great demand for more intelligent anomaly detection methods with distinct characteristics to cope with various needs. To this end, this thesis starts by presenting a thorough review of existing anomaly detection strategies and methods. The advantages and disadvantages of the strategies and methods are elaborated. Afterward, four distinctive anomaly detection methods, especially for time series, are proposed in this work, aiming at resolving specific needs of anomaly detection under different scenarios, e.g., enhanced accuracy, interpretable results, and self-evolving models. Experiments are presented and analysed to offer a better understanding of the performance of the methods and their distinct features. Specifically, the key contents of this thesis are summarised as follows: 1) Support Vector Data Description (SVDD) is investigated as a primary method to achieve accurate anomaly detection. The applicability of SVDD to noisy time series datasets is carefully examined, and it is demonstrated that relaxing the decision boundary of SVDD always results in better accuracy in network time series anomaly detection. Theoretical analysis of the parameter utilised in the model is also presented to ensure the validity of the relaxation of the decision boundary. 
2) To support a clear explanation of the detected time series anomalies, i.e., anomaly interpretation, the periodic pattern of time series data is considered as contextual information to be integrated into SVDD for anomaly detection. The formulation of SVDD with contextual information maintains multiple discriminants which help in distinguishing the root causes of the anomalies. 3) In an attempt to further analyse a dataset for anomaly detection and interpretation, Convex Hull Data Description (CHDD) is developed for realising one-class classification together with data clustering. CHDD approximates the convex hull of a given dataset with the extreme points, which constitute a dictionary of data representatives. Using the dictionary, CHDD is capable of representing and clustering all the normal data instances, so that anomaly detection is realised with a degree of interpretability. 4) Besides better anomaly detection accuracy and interpretability, better solutions for anomaly detection over streaming data with evolving patterns are also researched. Under the framework of Reinforcement Learning (RL), a time series anomaly detector that is consistently trained to cope with the evolving patterns is designed. Because the anomaly detector is trained with labeled time series, it avoids the cumbersome work of threshold setting and the uncertain definitions of anomalies in time series anomaly detection tasks.

    An analysis of vessel waypoint behavior through data clustering

    In this thesis, we cluster stop points into stop-point regions using one month’s Automatic Identification System (AIS) data from the Gulf of Mexico and Caribbean Sea to characterize vessel behavior in an area with diverse traffic patterns. Initial cleaning of the dataset is necessary to address multiple issues common to AIS transponders. We consider methods for computing inter-point distances. In particular, we study a promising method for combining geospatial coordinates with other vessel attributes. We use the Ordering Points To Identify the Clustering Structure (OPTICS) clustering algorithm because it can identify outliers, and it constructs clusters of varying shapes and densities. Our best results come from dividing the area of interest into seven zones of equal size, and analyzing the results over each zone. Using classification trees to develop a classification tool, we illustrate an approach for predicting the cluster membership of a new observation. Given the reduced computation time and the accuracy of the results, we recommend that further research utilize the methods from this study as the foundation for an automated threat detection system. http://archive.org/details/annalysisofvesse1094556135 Ensign, United States Navy. Approved for public release; distribution is unlimited.
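One way to compute an inter-point distance that combines geospatial coordinates with another vessel attribute is sketched below. The abstract does not specify the thesis's method, so this is only an assumed illustration: it uses the standard haversine great-circle distance and adds a fixed penalty when two stop points belong to different vessel types. The dictionary keys, the `vessel_type` attribute, and the penalty weight are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against rounding just above the valid asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))

def combined_distance(a, b, attribute_weight_km=10.0):
    """Geographic distance plus a fixed penalty when vessel types differ (assumed scheme)."""
    geo = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"])
    penalty = 0.0 if a["vessel_type"] == b["vessel_type"] else attribute_weight_km
    return geo + penalty

# Illustrative stop points (coordinates and types are made up).
stop_a = {"lat": 25.0, "lon": -90.0, "vessel_type": "tanker"}
stop_b = {"lat": 25.0, "lon": -89.9, "vessel_type": "cargo"}
print(combined_distance(stop_a, stop_b))
```

A pairwise matrix built from such a function could be passed to a clustering routine that accepts precomputed distances, which is one common way to use custom metrics with density-based algorithms such as OPTICS.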