
    On effective classification of strings with wavelets

    A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets

    An "outlier" can generally be defined as an observation that differs significantly from the other values in a data set. Outliers may be instances of error or may indicate events. Outlier detection aims to identify such observations in order to improve data analysis and to discover interesting and useful knowledge about unusual events across numerous application domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets, and we provide a comprehensive taxonomy framework and two decision trees for selecting the most suitable technique for a given data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework.
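
As a small illustration of the definition above (an observation that differs significantly from the other values), the following sketch flags values by their modified z-score, computed from the median and the median absolute deviation. This rule is an assumption chosen for illustration and is not taken from the paper's taxonomy; the function name `mad_outliers` and the default threshold of 3.5 are likewise illustrative.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Hypothetical helper for illustration only; not a technique named in the
    abstract above.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)  # median absolute deviation
    if mad == 0:
        # No spread around the median, so this rule cannot rank anything.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # when the data are roughly normal.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]
```

Because it relies on the median rather than the mean, a single extreme value does not inflate the scale estimate, which is why this kind of rule is often used as an unsupervised baseline for one-dimensional numeric data.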

    Traj-ARIMA: A Spatial-Time Series Model for Network-Constrained Trajectory

    Trajectory data play an important role in analyzing real-world applications that involve movement, e.g. natural and social phenomena such as bird migration, transportation management, urban planning and tourism analysis. Such trajectory data are a special kind of time series that has a spatial dimension in addition to the temporal one. Traditional time series models, especially the ARIMA (AutoRegressive Integrated Moving Average) model, provide a sound theoretical background and have enabled many successful applications for managing and forecasting time-dependent sequential data. This paper aims to extend the ARIMA model with a spatial dimension and to apply it to network-constrained trajectory data. We implement and evaluate the model on a trajectory database, in the context of a traffic application scenario in which vehicle movement is constrained by a given network infrastructure. The proposed Traj-ARIMA model has many potential applications, such as trajectory data regression and compression, outlier detection, and traffic flow and vehicle speed prediction. In this paper, the major focus is on vehicle speed forecasting.

    System Development for Detecting Outlier Transactions

    Initiative for Attractive Education in Graduate Schools: Program for Training Advanced Informatics Personnel with Practical IT Skills

    Finding Anomalous Periodic Time Series: An Application to Catalogs of Periodic Variable Stars

    Catalogs of periodic variable stars contain large numbers of periodic light-curves (photometric time series data from the astrophysics domain). Separating anomalous objects from well-known classes is an important step towards the discovery of new classes of astronomical objects. Most anomaly detection methods for time series data assume either a single continuous time series or a set of time series whose periods are aligned. Light-curve data preclude the use of these methods because the periods of any given pair of light-curves may be out of sync. One may use an existing anomaly detection method if, prior to similarity calculation, one performs the costly act of aligning two light-curves, an operation that scales poorly to massive data sets. This paper presents PCAD, an unsupervised anomaly detection method for large sets of unsynchronized periodic time-series data that outputs a ranked list of both global and local anomalies. It calculates the anomaly score of each light-curve in relation to a set of centroids produced by a modified k-means clustering algorithm. Our method is able to scale to large data sets through the use of sampling. We validate our method on both light-curve data and other time series data sets. We demonstrate its effectiveness at finding known anomalies, and discuss the effect of sample size and number of centroids on our results. We compare our method to naive solutions and existing time series anomaly detection methods for unphased data, and show that PCAD's reported anomalies are comparable to or better than those of all other methods. Finally, astrophysicists on our team have verified that PCAD finds true anomalies that might be indicative of novel astrophysical phenomena.
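
The general idea in this abstract (scoring each curve against centroids obtained by clustering curves whose phases are not aligned) can be sketched as follows. This is a minimal sketch under stated assumptions and not the authors' PCAD implementation: it assumes equal-length, uniformly sampled curves, uses a brute-force search over circular shifts as the phase-invariant distance, and takes the distance to the nearest centroid as the anomaly score. The names `shifted_distance`, `align`, `phased_kmeans` and `anomaly_scores` are hypothetical.

```python
import random


def shifted_distance(a, b):
    """Smallest squared Euclidean distance between a and any circular shift
    of b, returned together with the shift that achieves it."""
    n = len(a)
    best, best_shift = float("inf"), 0
    for s in range(n):
        d = sum((a[i] - b[(i + s) % n]) ** 2 for i in range(n))
        if d < best:
            best, best_shift = d, s
    return best, best_shift


def align(b, shift):
    """Circularly shift b by the given offset."""
    n = len(b)
    return [b[(i + shift) % n] for i in range(n)]


def phased_kmeans(curves, k, iters=10, seed=0):
    """k-means variant in which every curve is re-phased to its nearest
    centroid before the centroid is averaged (illustrative assumption)."""
    rng = random.Random(seed)
    centroids = [list(c) for c in rng.sample(curves, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for c in curves:
            dists = [shifted_distance(cen, c) for cen in centroids]
            j = min(range(k), key=lambda idx: dists[idx][0])
            groups[j].append(align(c, dists[j][1]))
        for j, g in enumerate(groups):
            if g:  # keep the previous centroid if no curve was assigned
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def anomaly_scores(curves, centroids):
    """Score each curve by its phase-invariant distance to the nearest centroid."""
    return [min(shifted_distance(cen, c)[0] for cen in centroids) for c in curves]
```

In this sketch, the sampling mentioned in the abstract would correspond to fitting `phased_kmeans` on a random subset of the curves and then calling `anomaly_scores` on the full set; sorting the scores in descending order gives a ranked list. The brute-force shift search costs quadratic time per pair of curves, which is the kind of alignment cost the abstract describes as expensive.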

    Event detection in high throughput social media
