6 research outputs found

    Exact and Heuristic Approaches to Speeding Up the MSM Time Series Distance Computation

    The computation of the distance of two time series is time-consuming for any elastic distance function that accounts for misalignments. Among those functions, DTW is the most prominent. However, a recent extensive evaluation has shown that the move-split-merge (MSM) metric is superior to DTW regarding the analytical accuracy of the 1-NN classifier. Unfortunately, the running time of the standard dynamic programming algorithm for MSM distance computation is $\Omega(n^2)$, where $n$ is the length of the longest time series. In this paper, we provide approaches to reducing the cost of MSM distance computations by using lower and upper bounds to prune paths early in the underlying dynamic programming table. For the case of one time series being a constant, we present a linear-time algorithm. In addition, we propose new linear-time heuristics and adapt heuristics known from DTW to computing the MSM distance. One heuristic employs the metric property of MSM and the previously introduced linear-time algorithm. Our experimental studies demonstrate substantial speed-ups in our approaches compared to previous MSM algorithms. In particular, the running time for MSM is faster than a state-of-the-art DTW distance computation for a majority of the popular UCR data sets.
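For orientation, the standard quadratic dynamic program that the abstract refers to can be sketched as follows. This is a minimal illustration of the commonly published MSM recurrence, not the pruned or linear-time algorithms of the paper. The function names `msm_cost` and `msm_distance` and the default split/merge cost `c=1.0` are assumptions made for this example.

```python
def msm_cost(new, prev, other, c):
    """Cost of one split/merge step (illustrative helper, name assumed)."""
    # If `new` lies between its neighbour and the opposing value, only c is paid.
    if prev <= new <= other or prev >= new >= other:
        return c
    # Otherwise pay c plus the distance to the closer of the two values.
    return c + min(abs(new - prev), abs(new - other))


def msm_distance(x, y, c=1.0):
    """Plain O(n*m) table fill for the MSM distance of two non-empty series."""
    if not x or not y:
        raise ValueError("both series must be non-empty")
    n, m = len(x), len(y)
    d = [[0.0] * m for _ in range(n)]
    d[0][0] = abs(x[0] - y[0])
    # First column: only split/merge steps along x are possible.
    for i in range(1, n):
        d[i][0] = d[i - 1][0] + msm_cost(x[i], x[i - 1], y[0], c)
    # First row: only split/merge steps along y are possible.
    for j in range(1, m):
        d[0][j] = d[0][j - 1] + msm_cost(y[j], x[0], y[j - 1], c)
    for i in range(1, n):
        for j in range(1, m):
            d[i][j] = min(
                d[i - 1][j - 1] + abs(x[i] - y[j]),                 # move
                d[i - 1][j] + msm_cost(x[i], x[i - 1], y[j], c),    # step along x
                d[i][j - 1] + msm_cost(y[j], x[i], y[j - 1], c),    # step along y
            )
    return d[n - 1][m - 1]
```

Every cell of the table is filled, which is where the quadratic running time comes from. The pruning approaches described in the abstract aim to skip cells that bounds show cannot lie on an optimal path.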

    An Experimental Evaluation of Time Series Classification Using Various Distance Measures

    In recent years a vast number of distance measures for time series classification have been proposed. The definition of a distance measure is crucial to further data mining tasks, so there is a need to decide which measure to choose for a particular dataset. The objective of this study is to provide a comprehensive comparison of 26 distance measures enriched with extensive statistical analysis. We compare different kinds of distance measures: shape-based, edit-based, feature-based and structure-based. Experimental results carried out on 34 benchmark datasets from the UCR Time Series Classification Archive are provided. We use a one-nearest-neighbour (1NN) classifier to compare the efficiency of the examined measures. Computation times were taken into consideration as well.

    Fast time series classification under lucky time warping distance


    Usability, Efficiency and Security of Personal Computing Technologies

    New personal computing technologies such as smartphones and personal fitness trackers are widely integrated into user lifestyles. Users possess a wide range of skills, attributes and backgrounds. It is important to understand user technology practices to ensure that new designs are usable and productive. Conversely, it is important to leverage our understanding of user characteristics to optimize new technology efficiency and effectiveness. Our work initially focused on studying older users, and personal fitness tracker users. We applied the insights from these investigations to develop new techniques improving user security protections, computational efficiency, and also enhancing the user experience. We offer that by increasing the usability, efficiency and security of personal computing technology, users will enjoy greater privacy protections along with experiencing greater enjoyment of their personal computing devices. Our first project resulted in an improved authentication system for older users based on familiar facial images. Our investigation revealed that older users are often challenged by traditional text passwords, resulting in decreased technology use or less than optimal password practices. Our graphical password-based system relies on memorable images from the user's personal past history. Our usability study demonstrated that this system was easy to use, enjoyable, and fast. We show that this technique is extendable to smartphones. Personal fitness trackers are very popular devices, often worn by users all day. Our personal fitness tracker investigation provides the first quantitative baseline of usage patterns with this device. By exploring public data, real-world user motivations, reliability concerns, activity levels, and fitness-related socialization patterns were discerned. This knowledge lends insight to active user practices. Personal user movement data is captured by sensors, then analyzed to provide benefits to the user. The dynamic time warping technique enables comparison of unequal data sequences, and sequences containing events at offset times. Existing techniques target short data sequences. Our Phase-aware Dynamic Time Warping algorithm focuses on a class of sinusoidal user movement patterns, resulting in improved efficiency over existing methods. Lastly, we address user data privacy concerns in an environment where user data is increasingly flowing to manufacturer remote cloud servers for analysis. Our secure computation technique protects the user's privacy while data is in transit and while resident on cloud computing resources. Our technique also protects important data on cloud servers from exposure to individual users.

    Uloga mera sličnosti u analizi vremenskih serija (The Role of Similarity Measures in Time Series Analysis)

    The subject of this dissertation encompasses a comprehensive overview and analysis of the impact of the Sakoe-Chiba global constraint on the most commonly used elastic similarity measures in the field of time-series data mining, with a focus on classification accuracy. The choice of similarity measure is one of the most significant aspects of time-series analysis: it should correctly reflect the resemblance between the data presented in the form of time series. Similarity measures represent a critical component of many time-series mining tasks, including classification, clustering, prediction, anomaly detection, and others. The research covered by this dissertation is oriented on several issues: 1. a review of the effects of global constraints on the performance of computing similarity measures, 2. a detailed analysis of the influence of constraining the elastic similarity measures on the accuracy of classical classification techniques, 3. an extensive study of the impact of different weighting schemes on the classification of time series, 4. development of an open source library (Framework for Analysis and Prediction, FAP) that integrates the main techniques and methods required for analysis and mining of time series, and which is used for the realization of these experiments.
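As a rough illustration of what a Sakoe-Chiba global constraint does to an elastic measure, the sketch below restricts DTW to a band of cells around the diagonal. It is a generic textbook-style example, not code from the FAP library. The function name `dtw_sakoe_chiba`, the `window` parameter, and the choice of squared difference as local cost (with no final square root) are assumptions made here.

```python
import math


def dtw_sakoe_chiba(x, y, window):
    """DTW cost restricted to cells with |i - j| <= window (illustrative sketch)."""
    n, m = len(x), len(y)
    # The band must be at least |n - m| wide, or the end cell is unreachable.
    w = max(window, abs(n - m))
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band are evaluated; the rest stay infinite.
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

With `window=0` and equal-length series the band collapses to the diagonal, so the result is the sum of squared pointwise differences. Widening the band lets the alignment absorb small shifts, and it also increases the number of cells computed, which is the trade-off between performance and accuracy that the abstract describes.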