
    Random projections for high-dimensional curves

    Modern time series analysis requires the ability to handle datasets that are inherently high-dimensional; examples include applications in climatology, where measurements from numerous sensors must be taken into account, or inventory tracking of large shops, where the dimension is defined by the number of tracked items. The standard way to mitigate computational issues arising from the high dimensionality of the data is to apply a dimension reduction technique that preserves the structural properties of the ambient space. The dissimilarity between two time series is often measured by ``discrete'' notions of distance, e.g. dynamic time warping, the discrete Fr\'echet distance, or simply the Euclidean distance. Since all these distance functions are computed directly on the points of a time series, they are sensitive to different sampling rates or gaps. The continuous Fr\'echet distance offers a popular alternative which aims to alleviate this by taking into account all points on the polygonal curve obtained by linearly interpolating between any two consecutive points in a sequence. We study the ability of random projections \`a la Johnson and Lindenstrauss to preserve the continuous Fr\'echet distance of polygonal curves by effectively reducing the dimension. In particular, we show that one can reduce the dimension to $O(\epsilon^{-2} \log N)$, where $N$ is the total number of input points, while preserving the continuous Fr\'echet distance between any two determined polygonal curves within a factor of $1 \pm \epsilon$. We conclude with applications on clustering.
    Comment: 22 pages
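The projection step the abstract refers to can be illustrated with a minimal sketch of a Gaussian Johnson-Lindenstrauss map applied to the vertices of a curve. This is a generic textbook construction, not the paper's specific analysis; the function name and target dimension are illustrative:

```python
import numpy as np

def jl_project(points: np.ndarray, target_dim: int, rng=None) -> np.ndarray:
    """Project d-dimensional curve vertices to target_dim dimensions
    using a random Gaussian Johnson-Lindenstrauss matrix."""
    rng = np.random.default_rng(rng)
    d = points.shape[1]
    # Entries drawn N(0, 1/target_dim) so squared distances are
    # preserved in expectation.
    G = rng.normal(0.0, 1.0 / np.sqrt(target_dim), size=(target_dim, d))
    return points @ G.T
```

Projecting every vertex of a polygonal curve with the same matrix yields a lower-dimensional polygonal curve, which is the setting in which the paper analyzes distortion of the continuous Fréchet distance.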

    Approximating $(k,\ell)$-center clustering for curves

    The Euclidean $k$-center problem is a classical problem that has been extensively studied in computer science. Given a set $\mathcal{G}$ of $n$ points in Euclidean space, the problem is to determine a set $\mathcal{C}$ of $k$ centers (not necessarily part of $\mathcal{G}$) such that the maximum distance between a point in $\mathcal{G}$ and its nearest neighbor in $\mathcal{C}$ is minimized. In this paper we study the corresponding $(k,\ell)$-center problem for polygonal curves under the Fr\'echet distance, that is, given a set $\mathcal{G}$ of $n$ polygonal curves in $\mathbb{R}^d$, each of complexity $m$, determine a set $\mathcal{C}$ of $k$ polygonal curves in $\mathbb{R}^d$, each of complexity $\ell$, such that the maximum Fr\'echet distance of a curve in $\mathcal{G}$ to its closest curve in $\mathcal{C}$ is minimized. We substantially extend and improve the known approximation bounds for curves in dimension $2$ and higher. We show that, if $\ell$ is part of the input, then there is no polynomial-time approximation scheme unless $\mathsf{P}=\mathsf{NP}$. Our constructions yield different bounds for one- and two-dimensional curves and for the discrete and continuous Fr\'echet distance. In the case of the discrete Fr\'echet distance on two-dimensional curves, we show hardness of approximation within a factor close to $2.598$. This result also holds when $k=1$, and the $\mathsf{NP}$-hardness extends to the case $\ell=\infty$, i.e., to the problem of computing the minimum-enclosing ball under the Fr\'echet distance. Finally, we observe that a careful adaptation of Gonzalez' algorithm in combination with a curve simplification yields a $3$-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.
    Comment: 24 pages; results on minimum-enclosing ball added, additional author added, general revision
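Gonzalez' algorithm, which the abstract adapts to curves, is the classical farthest-first traversal for $k$-center. A minimal sketch for Euclidean point sets follows; the paper's variant replaces the Euclidean metric with the Fréchet distance and combines it with curve simplification, which this toy version does not attempt:

```python
import numpy as np

def gonzalez_k_center(points: np.ndarray, k: int) -> list:
    """Farthest-first traversal: the classical 2-approximation for
    Euclidean k-center. Returns indices of the chosen centers."""
    centers = [0]  # start from an arbitrary point
    # dist[i] = distance from point i to its nearest chosen center
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        far = int(np.argmax(dist))  # farthest point becomes the next center
        centers.append(far)
        dist = np.minimum(dist, np.linalg.norm(points - points[far], axis=1))
    return centers
```

Each iteration costs one pass over the data, giving $O(nk)$ time for $n$ points, which is why the traversal remains attractive as a subroutine even in the curve setting.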

    On the Hardness of Computing an Average Curve

    We study the complexity of clustering curves under $k$-median and $k$-center objectives in the metric space of the Fr\'echet distance and related distance measures. Building upon recent hardness results for the minimum-enclosing-ball problem under the Fr\'echet distance, we show that the $1$-median problem is also NP-hard. Furthermore, we show that the $1$-median problem is W[1]-hard with the number of curves as parameter. We show this under the discrete and continuous Fr\'echet distance and the Dynamic Time Warping (DTW) distance. This yields an independent proof of an earlier result by Bulteau et al. from 2018 for a variant of DTW that uses squared distances, where the new proof is both simpler and more general. On the positive side, we give approximation algorithms for problem variants where the center curve may have complexity at most $\ell$ under the discrete Fr\'echet distance. In particular, for fixed $k$, $\ell$ and $\varepsilon$, we give $(1+\varepsilon)$-approximation algorithms for the $(k,\ell)$-median and $(k,\ell)$-center objectives and a polynomial-time exact algorithm for the $(k,\ell)$-center objective.
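The discrete Fréchet distance that appears throughout these results is computable by a standard quadratic-time dynamic program (the textbook algorithm, not a contribution of the paper above):

```python
import math

def discrete_frechet(P, Q):
    """Discrete Frechet distance between point sequences P and Q via
    the standard O(len(P) * len(Q)) dynamic program over couplings."""
    n, m = len(P), len(Q)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                ca[i][j] = cost
            elif i == 0:          # can only advance along Q
                ca[i][j] = max(ca[i][j - 1], cost)
            elif j == 0:          # can only advance along P
                ca[i][j] = max(ca[i - 1][j], cost)
            else:                 # advance along P, Q, or both
                ca[i][j] = max(min(ca[i - 1][j], ca[i][j - 1],
                                   ca[i - 1][j - 1]), cost)
    return ca[n - 1][m - 1]
```

The table entry `ca[i][j]` is the cheapest maximum leash length over all monotone couplings of the prefixes `P[:i+1]` and `Q[:j+1]`.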

    Using time-series similarity measures to compare animal movement trajectories in ecology

    Identifying and understanding patterns in movement data are amongst the principal aims of movement ecology. By quantifying the similarity of movement trajectories, inferences can be made about diverse processes, ranging from individual specialisation to the ontogeny of foraging strategies. Movement analysis is not unique to ecology, however, and methods for estimating the similarity of movement trajectories have been developed in other fields but are currently under-utilised by ecologists. Here, we introduce five commonly used measures of trajectory similarity: dynamic time warping (DTW), longest common subsequence (LCSS), edit distance for real sequences (EDR), Fréchet distance and nearest neighbour distance (NND), of which only NND is routinely used by ecologists. We investigate the performance of each of these measures by simulating movement trajectories using an Ornstein-Uhlenbeck (OU) model in which we varied the following parameters: (1) the point of attraction, (2) the strength of attraction to this point and (3) the noise or volatility added to the movement process, in order to determine which measures were most responsive to such changes. In addition, we demonstrate how these measures can be applied using movement trajectories of breeding northern gannets (Morus bassanus) by performing trajectory clustering on a large ecological dataset. Simulations showed that DTW and Fréchet distance were most responsive to changes in movement parameters and were able to distinguish between all the different parameter combinations we trialled. In contrast, NND was the least sensitive measure trialled. When applied to our gannet dataset, the five similarity measures were highly correlated despite differences in their underlying calculation. Clustering of trajectories within and across individuals allowed us to easily visualise and compare patterns of space use over time across a large dataset.
    Trajectory clusters reflected the bearing on which birds departed the colony and highlighted the use of well-known bathymetric features. As both the volume of movement data and the need to quantify similarity amongst animal trajectories grow, the measures described here and the bridge they provide to other fields of research will become increasingly useful in ecology.
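Of the five measures surveyed above, DTW is perhaps the most widely implemented. A minimal sketch of the textbook dynamic program follows (generic, not the specific implementation used in the study):

```python
import math

def dtw(a, b):
    """Dynamic time warping distance between two point sequences,
    computed with the standard O(len(a) * len(b)) dynamic program."""
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = minimum cumulative cost aligning a[:i] with b[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(a[i - 1], b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # a advances, b repeats
                                 D[i][j - 1],      # b advances, a repeats
                                 D[i - 1][j - 1])  # both advance
    return D[n][m]
```

Because costs are summed rather than maximized, DTW rewards overall shape agreement, whereas the Fréchet distance is dominated by the single worst-aligned pair of points; this difference explains why the two measures can rank trajectory pairs differently.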

    Locality-Sensitive Hashing of Curves

    We study data structures for storing a set of polygonal curves in $\mathbb{R}^d$ such that, given a query curve, we can efficiently retrieve similar curves from the set, where similarity is measured using the discrete Fr\'echet distance or the dynamic time warping distance. To this end we devise the first locality-sensitive hashing schemes for these distance measures. A major challenge is posed by the fact that these distance measures internally optimize the alignment between the curves. We give solutions for different types of alignments, including constrained and unconstrained versions. For unconstrained alignments, we improve over a result by Indyk from 2002 for short curves. Let $n$ be the number of input curves and let $m$ be the maximum complexity of a curve in the input. In the particular case where $m \leq \frac{\alpha}{4d} \log n$, for some fixed $\alpha>0$, our solutions imply an approximate near-neighbor data structure for the discrete Fr\'echet distance that uses space in $O(n^{1+\alpha}\log n)$ and achieves query time in $O(n^{\alpha}\log^2 n)$ and constant approximation factor. Furthermore, our solutions provide a trade-off between approximation quality and computational performance: for any parameter $k \in [m]$, we can give a data structure that uses space in $O(2^{2k}m^{k-1} n \log n + nm)$, answers queries in $O(2^{2k} m^{k}\log n)$ time and achieves approximation factor in $O(m/k)$.
    Comment: Proc. of 33rd International Symposium on Computational Geometry (SoCG), 2017
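A common building block for hashing curves is snapping vertices to a randomly shifted grid and collapsing consecutive duplicates, so that nearby curves land in the same bucket. The toy sketch below illustrates that idea only; it is a simplification for intuition, not the scheme analyzed in the paper:

```python
import numpy as np

def snap_hash(curve: np.ndarray, cell: float, shift: np.ndarray) -> tuple:
    """Hash a curve by snapping each vertex to a shifted grid of side
    length `cell` and collapsing consecutive duplicate cells -- a toy
    sketch of the randomly-shifted-grid idea used in curve LSH."""
    snapped = np.floor((curve + shift) / cell).astype(int)
    key = [tuple(snapped[0])]
    for v in snapped[1:]:
        if tuple(v) != key[-1]:   # drop repeats so small wiggles still collide
            key.append(tuple(v))
    return tuple(key)
```

In a full scheme the shift would be drawn at random per hash function, and the cell size tied to the query radius, so that curves within the radius collide with constant probability.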

    Statistical M-Estimation and Consistency in Large Deformable Models for Image Warping

    The problem of defining appropriate distances between shapes or images and modeling the variability of natural images by group transformations is at the heart of modern image analysis. A current trend is the study of probabilistic and statistical aspects of deformation models, and the development of consistent statistical procedures for the estimation of template images. In this paper, we consider a set of images randomly warped from a mean template which has to be recovered. For this, we define an appropriate statistical parametric model to generate random diffeomorphic deformations in two dimensions. Then, we focus on the problem of estimating the mean pattern when the images are observed with noise. This problem is challenging both from a theoretical and a practical point of view. M-estimation theory enables us to build an estimator defined as a minimizer of a well-tailored empirical criterion. We prove the convergence of this estimator and propose a gradient descent algorithm to compute this M-estimator in practice. Simulations of template extraction and an application to image clustering and classification are also provided.

    Computing a Subtrajectory Cluster from c-packed Trajectories

    We present a near-linear time approximation algorithm for the subtrajectory cluster problem on $c$-packed trajectories. The problem involves finding $m$ subtrajectories within a given trajectory $T$ such that their Fr\'echet distances are at most $(1 + \varepsilon)d$, and at least one subtrajectory must be of length $l$ or longer. A trajectory $T$ is $c$-packed if the intersection of $T$ and any ball $B$ with radius $r$ is at most $c \cdot r$ in length. Previous results by Gudmundsson and Wong \cite{GudmundssonWong2022Cubicupperlower} established an $\Omega(n^3)$ lower bound unless the Strong Exponential Time Hypothesis fails, and they presented an $O(n^3 \log^2 n)$ time algorithm. We circumvent this conditional lower bound by studying the subtrajectory cluster problem on $c$-packed trajectories, resulting in an algorithm with $O((c^2 n/\varepsilon^2)\log(c/\varepsilon)\log(n/\varepsilon))$ time complexity.
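The $c$-packedness condition above bounds the length of the curve inside any ball. The helper below evaluates that length for one ball by clipping each segment against the sphere (standard segment-sphere intersection, shown here only to make the definition concrete):

```python
import math

def length_inside_ball(curve, center, r):
    """Total length of the part of a polygonal curve inside the ball
    B(center, r). A curve is c-packed if this is at most c*r for every
    ball; this helper checks a single candidate ball."""
    total = 0.0
    for p, q in zip(curve, curve[1:]):
        d = [q[i] - p[i] for i in range(len(p))]   # segment direction
        f = [p[i] - center[i] for i in range(len(p))]
        a = sum(x * x for x in d)
        if a == 0:
            continue                               # degenerate segment
        b = 2 * sum(f[i] * d[i] for i in range(len(p)))
        c = sum(x * x for x in f) - r * r
        disc = b * b - 4 * a * c
        if disc <= 0:
            continue                               # line misses the ball
        s = math.sqrt(disc)
        t0 = max(0.0, (-b - s) / (2 * a))          # clip to parameter range
        t1 = min(1.0, (-b + s) / (2 * a))
        if t1 > t0:
            total += (t1 - t0) * math.sqrt(a)
    return total
```

Verifying $c$-packedness exactly would require checking all balls, which is non-trivial; the algorithm in the paper assumes the bound rather than testing it.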