111 research outputs found

    Searching and mining trillions of time series subsequences under dynamic time warping


    Time Series Data Mining: A Retail Application Using SAS Enterprise Miner

    Modern technologies have allowed data to be amassed at a rate never encountered before. Organizations are now able to routinely collect and process massive volumes of data. Much of this regularly collected information can be ordered using an appropriate time interval, which turns it into a time series. With such data, analytical techniques can be employed to extract information about historical trends and seasonality. Time series data mining methodology allows users to identify commonalities between sets of time-ordered data. This technique is supported by a variety of algorithms, notably dynamic time warping (DTW), a mathematical technique that supports the identification of similarities between numerous time series. The following research aims to provide a practical application of this methodology using SAS Enterprise Miner, an industry-leading software platform for business analytics. Due to the prevalence of time series data in retail settings, a realistic product sales transaction data set was analyzed. This information was provided by dunnhumbyUSA. Interpretations were drawn from output that was generated using “TS nodes” in SAS Enterprise Miner.

    The Use of MPI and OpenMP Technologies for Subsequence Similarity Search in Very Large Time Series on Computer Cluster System with Nodes Based on the Intel Xeon Phi Knights Landing Many-core Processor

    Nowadays, subsequence similarity search is required in a wide range of time series mining applications: climate modeling, financial forecasts, medical research, etc. In most of these applications, the Dynamic Time Warping (DTW) similarity measure is used, since DTW is empirically confirmed as one of the best similarity measures for most subject domains. Since the DTW measure has a quadratic computational complexity w.r.t. the length of the query subsequence, a number of parallel algorithms for various many-core architectures have been developed, namely FPGA, GPU, and Intel MIC. In this article, we propose a new parallel algorithm for subsequence similarity search in very large time series on computer cluster systems with nodes based on Intel Xeon Phi Knights Landing (KNL) many-core processors. Computations are parallelized on two levels as follows: through MPI at the level of all cluster nodes, and through OpenMP within one cluster node. The algorithm involves additional data structures and redundant computations, which make it possible to effectively use the capabilities of vector computations on Phi KNL. Experimental evaluation of the algorithm on real-world and synthetic datasets shows that it is highly scalable.
    Comment: Accepted for publication in the "Numerical Methods and Programming" journal (http://num-meth.srcc.msu.ru/english/, in Russian "Vychislitelnye Metody i Programmirovanie"), in Russia
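
The quadratic cost mentioned in the abstract above comes from the dynamic-programming table that DTW fills in, one cell per pair of points. The following is a minimal serial sketch of textbook DTW and a naive sliding-window subsequence search, not the parallel algorithm proposed in the paper. The function names `dtw_distance` and `best_match` are hypothetical, and the sketch assumes squared pointwise cost, no warping-window constraint, and no z-normalization.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences (illustrative sketch).

    Fills an (len(a)+1) x (len(b)+1) table row by row, so the running
    time is proportional to len(a) * len(b).
    """
    n, m = len(a), len(b)
    # prev[j] holds the best cumulative cost for the previous row.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three allowed predecessor cells.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return math.sqrt(prev[m])


def best_match(series, query):
    """Naive subsequence search: try every window of len(query)."""
    m = len(query)
    best_pos, best_dist = -1, math.inf
    for start in range(len(series) - m + 1):
        d = dtw_distance(series[start:start + m], query)
        if d < best_dist:
            best_pos, best_dist = start, d
    return best_pos, best_dist
```

Each window costs on the order of the squared query length, and there is one window per starting position in the series, which is why very long series motivate pruning or parallel hardware.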

    Interactive time series analytics powered by ONEX

    Modern applications in this digital age collect a staggering amount of time series data from economic growth rates to electrical household consumption habits. To make sense of it, domain analysts interactively sift through these time series collections in search of critical relationships between and recurring patterns within these time series. The ONEX (Online Exploration of Time Series) system supports effective exploratory analysis of time series collections composed of heterogeneous, variable-length and misaligned time series using robust alignment dynamic time warping (DTW) methods. To assure real-time responsiveness even for these complex and compute-intensive analytics, ONEX precomputes and then encodes time series relationships based on the inexpensive-to-compute Euclidean distance into the ONEX base. Thereafter, based on a solid formal foundation, ONEX uses DTW-enhanced analytics to correctly extract relevant time series matches on this Euclidean-prepared ONEX base. Our live interactive demonstration shows how our ONEX exploratory tool, supported by a rich array of visual interactions and expressive visualizations, enables efficient mining and interpretation of the MATTERS real data collection composed of economic, social, and education data trends across the fifty American states. © 2017 ACM

    Subjectively interesting motifs in time series

    This paper introduces an approach to find motifs in time series that are subjectively interesting. That is, the aim is to find motifs that are surprising given an informative background distribution, which may for example correspond to the prior knowledge of a user of the tool. We quantify this surprisal using information theory, and more particularly the FORSIED framework. The resulting interestingness function according to which motifs are ranked is then subjective in the statistical sense, enabling us to find subsequence patterns (i.e., motifs and outliers) that are more truly interesting. Although finding the best motif appears intractable, we develop relaxations and a branch-and-bound approach that is implemented in a constraint programming solver. As shown in experiments on synthetic data and two real-world data sets, this enables us to mine interesting patterns in small or mid-sized time series.