17 research outputs found

    Finding long and similar parts of trajectories

    Full text link
    A natural time-dependent similarity measure for two trajectories is their average distance at corresponding times. We give algorithms for computing the most similar subtrajectories under this measure, assuming the two trajectories are given as two polygonal, possibly self-intersecting lines. When a minimum duration is specified for the subtrajectories, and they must start at exactly corresponding times in the input trajectories, we give a linear-time algorithm for computing the starting time and duration of the most similar subtrajectories. The algorithm is based on a result of independent interest: We present a linear-time algorithm to find, for a piece-wise monotone function, an interval of at least a given length that has minimum average value. When the two subtrajectories can start at different times in the two input trajectories, it appears difficult to give an exact algorithm for the most similar subtrajectories problem, even if the duration of the desired two subtrajectories is fixed to some length. We show that the problem can be solved approximately, and with a performance guarantee. More precisely, we present (1 + e)-approximation algorithms for computing the most similar subtrajectories of two input trajectories for the case where the duration is specified, and also for the case where only a minimum on the duration is specified

    Computing Similarity between a Pair of Trajectories

    Full text link
    With recent advances in sensing and tracking technology, trajectory data is becoming increasingly pervasive and analysis of trajectory data is becoming exceedingly important. A fundamental problem in analyzing trajectory data is that of identifying common patterns between pairs or among groups of trajectories. In this paper, we consider the problem of identifying similar portions between a pair of trajectories, each observed as a sequence of points sampled from it. We present new measures of trajectory similarity --- both local and global --- between a pair of trajectories to distinguish between similar and dissimilar portions. Our model is robust under noise and outliers, it does not make any assumptions on the sampling rates on either trajectory, and it works even if they are partially observed. Additionally, the model also yields a scalar similarity score which can be used to rank multiple pairs of trajectories according to similarity, e.g. in clustering applications. We also present efficient algorithms for computing the similarity under our measures; the worst-case running time is quadratic in the number of sample points. Finally, we present an extensive experimental study evaluating the effectiveness of our approach on real datasets, comparing with it with earlier approaches, and illustrating many issues that arise in trajectory data. Our experiments show that our approach is highly accurate in distinguishing similar and dissimilar portions as compared to earlier methods even with sparse sampling

    Path Similarity Analysis: a Method for Quantifying Macromolecular Pathways

    Full text link
    Diverse classes of proteins function through large-scale conformational changes; sophisticated enhanced sampling methods have been proposed to generate these macromolecular transition paths. As such paths are curves in a high-dimensional space, they have been difficult to compare quantitatively, a prerequisite to, for instance, assess the quality of different sampling algorithms. The Path Similarity Analysis (PSA) approach alleviates these difficulties by utilizing the full information in 3N-dimensional trajectories in configuration space. PSA employs the Hausdorff or Fr\'echet path metrics---adopted from computational geometry---enabling us to quantify path (dis)similarity, while the new concept of a Hausdorff-pair map permits the extraction of atomic-scale determinants responsible for path differences. Combined with clustering techniques, PSA facilitates the comparison of many paths, including collections of transition ensembles. We use the closed-to-open transition of the enzyme adenylate kinase (AdK)---a commonly used testbed for the assessment enhanced sampling algorithms---to examine multiple microsecond equilibrium molecular dynamics (MD) transitions of AdK in its substrate-free form alongside transition ensembles from the MD-based dynamic importance sampling (DIMS-MD) and targeted MD (TMD) methods, and a geometrical targeting algorithm (FRODA). A Hausdorff pairs analysis of these ensembles revealed, for instance, that differences in DIMS-MD and FRODA paths were mediated by a set of conserved salt bridges whose charge-charge interactions are fully modeled in DIMS-MD but not in FRODA. We also demonstrate how existing trajectory analysis methods relying on pre-defined collective variables, such as native contacts or geometric quantities, can be used synergistically with PSA, as well as the application of PSA to more complex systems such as membrane transporter proteins.Comment: 9 figures, 3 tables in the main manuscript; supplementary information includes 7 texts (S1 Text - S7 Text) and 11 figures (S1 Fig - S11 Fig) (also available from journal site

    Mining candidate causal relationships in movement patterns

    Get PDF
    This is an Accepted Manuscript of an article published by Taylor & Francis in the International Journal of Geographical Information Science on 01 October 2013, available online: http://wwww.tandfonline.com/10.1080/13658816.2013.841167In many applications, the environmental context for, and drivers of movement patterns are just as important as the patterns themselves. This paper adapts standard data mining techniques, combined with a foundational ontology of causation, with the objective of helping domain experts identify candidate causal relationships between movement patterns and their environmental context. In addition to data about movement and its dynamic environmental context, our approach requires as input definitions of the states and events of interest. The technique outputs causal and causal-like relationships of potential interest, along with associated measures of support and confidence. As a validation of our approach, the analysis is applied to real data about fish movement in the Murray River in Australia. The results demonstrate the technique is capable of identifying statistically significant patterns of movement indicative of causal and causal-like relationships. 1365-8816Australian Research Council Discovery Projec

    Similarity of trajectories taking into account geographic context

    Get PDF
    The movements of animals, people, and vehicles are embedded in a geographic context. This context influences the movement and may cause the formation of certain behavioral responses. Thus, it is essential to include context parameters in the study of movement and the development of movement pattern analytics. Advances in sensor technologies and positioning devices provide valuable data not only of moving agents but also of the circumstances embedding the movement in space and time. Developing knowledge discovery methods to investigate the relation between movement and its surrounding context is a major challenge in movement analysis today. In this paper we show how to integrate geographic context into the similarity analysis of movement data. For this, we discuss models for geographic context of movement data. Based on this we develop simple but efficient context-aware similarity measures for movement trajectories, which combine a spatial and a contextual distance. These are based on well-known similarity measures for trajectories, such as the Hausdorff, Fréchet, or equal time distance. We validate our approach by applying these measures to movement data of hurricanes and albatross

    Exploring dance movement data using sequence alignment methods

    Get PDF
    Despite the abundance of research on knowledge discovery from moving object databases, only a limited number of studies have examined the interaction between moving point objects in space over time. This paper describes a novel approach for measuring similarity in the interaction between moving objects. The proposed approach consists of three steps. First, we transform movement data into sequences of successive qualitative relations based on the Qualitative Trajectory Calculus (QTC). Second, sequence alignment methods are applied to measure the similarity between movement sequences. Finally, movement sequences are grouped based on similarity by means of an agglomerative hierarchical clustering method. The applicability of this approach is tested using movement data from samba and tango dancers

    New directions in the analysis of movement patterns in space and time

    Get PDF

    Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways

    Get PDF
    abstract: Diverse classes of proteins function through large-scale conformational changes and various sophisticated computational algorithms have been proposed to enhance sampling of these macromolecular transition paths. Because such paths are curves in a high-dimensional space, it has been difficult to quantitatively compare multiple paths, a necessary prerequisite to, for instance, assess the quality of different algorithms. We introduce a method named Path Similarity Analysis (PSA) that enables us to quantify the similarity between two arbitrary paths and extract the atomic-scale determinants responsible for their differences. PSA utilizes the full information available in 3N-dimensional configuration space trajectories by employing the Hausdorff or Fréchet metrics (adopted from computational geometry) to quantify the degree of similarity between piecewise-linear curves. It thus completely avoids relying on projections into low dimensional spaces, as used in traditional approaches. To elucidate the principles of PSA, we quantified the effect of path roughness induced by thermal fluctuations using a toy model system. Using, as an example, the closed-to-open transitions of the enzyme adenylate kinase (AdK) in its substrate-free form, we compared a range of protein transition path-generating algorithms. Molecular dynamics-based dynamic importance sampling (DIMS) MD and targeted MD (TMD) and the purely geometric FRODA (Framework Rigidity Optimized Dynamics Algorithm) were tested along with seven other methods publicly available on servers, including several based on the popular elastic network model (ENM). PSA with clustering revealed that paths produced by a given method are more similar to each other than to those from another method and, for instance, that the ENM-based methods produced relatively similar paths. PSA applied to ensembles of DIMS MD and FRODA trajectories of the conformational transition of diphtheria toxin, a particularly challenging example, showed that the geometry-based FRODA occasionally sampled the pathway space of force field-based DIMS MD. For the AdK transition, the new concept of a Hausdorff-pair map enabled us to extract the molecular structural determinants responsible for differences in pathways, namely a set of conserved salt bridges whose charge-charge interactions are fully modelled in DIMS MD but not in FRODA. PSA has the potential to enhance our understanding of transition path sampling methods, validate them, and to provide a new approach to analyzing conformational transitions.The article is published at http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.100456
    corecore