148 research outputs found

    A Heuristic Based on the Intrinsic Dimensionality for Reducing the Number of Cyclic DTW Comparisons in Shape Classification and Retrieval Using AESA

    Cyclic Dynamic Time Warping (CDTW) is a good dissimilarity measure for high-dimensional, contour-based shape descriptors, but it is computationally expensive. For recognition tasks, it is therefore useful to have a method that reduces the number of comparisons and avoids an exhaustive search. The Approximate and Eliminate Search Algorithm (AESA) is a relevant indexing method because it drastically reduces the number of comparisons; however, it requires a metric distance, which CDTW is not. In this paper, we introduce a heuristic based on the intrinsic dimensionality that allows CDTW and AESA to be used together in classification and retrieval tasks over these shape descriptors. Experimental results show that, for descriptors of high dimensionality, our proposal is optimal in practice and significantly outperforms an exhaustive search, which is otherwise the only alternative for these descriptors and CDTW in these tasks.
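
To make the cost issue in the abstract above concrete, here is a minimal brute-force sketch of a cyclic DTW dissimilarity. It is an illustration only, not the paper's algorithm: it assumes one-dimensional descriptors, an absolute-difference local cost, and it simply takes the minimum plain DTW over every rotation of the second sequence. The names `dtw` and `cyclic_dtw` are hypothetical.

```python
def dtw(a, b):
    """Plain DTW between two non-empty 1-D sequences (absolute-difference cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # cost of aligning two empty prefixes
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible predecessor alignments
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def cyclic_dtw(a, b):
    """Brute-force cyclic DTW: minimum DTW over all rotations of b (b non-empty)."""
    b = list(b)
    return min(dtw(a, b[k:] + b[:k]) for k in range(len(b)))


contour = [0, 1, 2, 3]
rotated = [2, 3, 0, 1]  # same closed contour, different starting point
print(dtw(contour, rotated), cyclic_dtw(contour, rotated))
```

Each call to `cyclic_dtw` runs one quadratic-time DTW per rotation, which is why reducing the number of comparisons during a search matters for this kind of dissimilarity.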

    Improving the robustness and reliability of population-based global biodiversity indicators

    The current global biodiversity crisis is complicated by a data crisis. Reliable tools are needed to guide scientific research and conservation policy decisions, but the data underlying those tools are incomplete and biased. For example, the Living Planet Index (LPI) tracks the changing status of global vertebrate biodiversity, but gaps, biases and quality issues plague the aggregated data used to calculate trends. Unfortunately, we have little understanding of how reliable biodiversity indicators are. In this thesis I develop a suite of tools to assess and improve the reliability of trends in the LPI and similar indicators. First, I explore distance measures as a flexible toolset for comparing time series and trends. I test distance measures for properties related to time series comparisons and rate their relative sensitivities, then expand the results into a framework for choosing an appropriate distance measure for any time series comparison task in ecology. I use the framework to select an appropriate metric for determining trend accuracy. Second, I construct a model of trend reliability from accuracy measurements of sampled trend replicates calculated from artificially generated time series datasets. I apply the model to the LPI to reveal that the majority of trends need more data to be considered reliable, particularly across the global south, and for reptiles and amphibians everywhere. Finally, I develop a method to account for sampling error and serial correlation in confidence intervals of indicators that use aggregated abundance data from different sources. I show that the new method results in more robust and accurate confidence intervals across a wide range of dataset parameters, without reducing trend accuracy. I also apply the method to the LPI to reveal that the current method used by the LPI results in inaccurate and overly wide confidence intervals.

    Similarity Search for Spatial Trajectories Using Online Lower Bounding DTW and Presorting Strategies

    Similarity search over time series has received much attention from research and industry in the last decade. Dynamic time warping is one of the most widely used distance measures in this context, owing to the simplicity of its definition and its surprisingly good quality for time series classification. However, dynamic time warping does not behave well with many dimensionality reduction techniques because it does not fulfill the triangle inequality. Additionally, most research on dynamic time warping has been performed with one-dimensional time series or in multivariate cases of varying dimensions. In this paper, we propose three extensions to LB_Rotation for two-dimensional time series (trajectories). We simplify LB_Rotation, adapt it to the online and data streaming case, and show how to tune the pruning ratio in similarity search by using presorting strategies based on simple summaries of trajectories. Finally, we provide a thorough evaluation of these aspects on a large variety of datasets of spatial trajectories.
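
The claim that dynamic time warping does not fulfill the triangle inequality can be checked on a tiny example. The sketch below is an illustration under stated assumptions (one-dimensional sequences, absolute-difference local cost, no warping window); the function name `dtw` and the three sequences are hypothetical and are not taken from the paper.

```python
from functools import lru_cache


def dtw(a, b):
    """Unconstrained DTW between two short 1-D sequences (absolute-difference cost)."""
    a, b = tuple(a), tuple(b)

    @lru_cache(maxsize=None)
    def d(i, j):
        # d(i, j) is the cheapest alignment of the first i items of a
        # with the first j items of b
        if i == 0 and j == 0:
            return 0.0
        if i == 0 or j == 0:
            return float("inf")
        step = abs(a[i - 1] - b[j - 1])
        return step + min(d(i - 1, j), d(i, j - 1), d(i - 1, j - 1))

    return d(len(a), len(b))


x, y, z = [0], [1, 2], [2, 3, 3]
direct = dtw(x, z)               # 2 + 3 + 3 = 8
detour = dtw(x, y) + dtw(y, z)   # 3 + 3 = 6
print(direct, detour, direct > detour)
```

Because the direct dissimilarity exceeds the detour through `y`, pruning rules that rely on the triangle inequality cannot be applied to DTW as-is, which is why lower bounds are used instead.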

    On clustering and related problems on curves under the Fréchet distance

    Sensor measurements can be represented as points in ℝ^d. Ordered by the time-stamps of these measurements, they yield a time series that can be interpreted as a polygonal curve in the d-dimensional ambient space. The number of vertices is called the complexity of the curve. In this thesis we study several fundamental computational tasks on curves: clustering, simplification, and embedding, under the Fréchet distance, which is a popular distance measure for curves, in its continuous and discrete versions. We focus on curves in the one-dimensional ambient space ℝ. We study the problem of clustering curves in ℝ under the Fréchet distance, in particular the following variations of the well-known k-center and k-median problems. Given is a set P of n curves in ℝ, each of complexity at most m. Our goal is to find k curves in ℝ, not necessarily from P, called cluster centers, each of complexity at most ℓ. In the (k, ℓ)-center problem, the maximum distance of an element of P to its nearest cluster center is minimized. In the (k, ℓ)-median problem, the sum of these distances is minimized. We show that both problems are NP-hard under both versions of the Fréchet distance if k is part of the input. Under the continuous Fréchet distance, we give (1 + ε)-approximation algorithms for both the (k, ℓ)-center and the (k, ℓ)-median problem, with running time near-linear in the input size for constant ε, k and ℓ. Our techniques yield constant-factor approximation algorithms for the observed problems under the discrete Fréchet distance. To obtain the (1 + ε)-approximation algorithms for the clustering problems under the continuous Fréchet distance, we develop a new simplification technique for one-dimensional curves, called δ-signature. The signatures always exist, and we can compute them efficiently. We also study the problem of embedding the Fréchet distance into the space ℝ. We show that, in the worst case and under reasonable assumptions, the discrete Fréchet distance between two polygonal curves of complexity m in ℝ^d, where 2 ≤ d ≤ 7, degrades by a factor linear in m with constant probability when the curves are projected onto a randomly chosen line. We show upper and lower bounds on the distortion. Sensor measurements can also define a discrete distribution over possible locations of a point in ℝ^d. Then, the input consists of n probabilistic points. We study the probabilistic 1-center problem in Euclidean space ℝ^d, also known as the probabilistic smallest enclosing ball (pSEB) problem. To improve the best existing algorithm for the pSEB problem by reducing its exponential dependence on the dimension to linear, we study the deterministic set median problem, which generalizes both the 1-center and the 1-median problems. We present a (1 + ε)-approximation algorithm for the set median problem, using a novel combination of sampling techniques and stochastic subgradient descent. Our (1 + ε)-approximation algorithm for the pSEB problem takes linear time in d and n, making the pSEB algorithm applicable to shape fitting problems in Hilbert spaces of unbounded dimension using kernel functions. We present an exemplary application by extending the support vector data description (SVDD) shape fitting method to the probabilistic case.
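
For readers unfamiliar with the distance used throughout this abstract, the following is a minimal sketch of the discrete Fréchet distance between two one-dimensional curves, computed with the standard dynamic program over vertex couplings. It is a textbook illustration, not one of the thesis's algorithms; the function name `discrete_frechet` and the sample curves are hypothetical.

```python
def discrete_frechet(p, q):
    """Discrete Fréchet distance between two non-empty 1-D curves given as vertex lists."""
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both curves need at least one vertex")
    inf = float("inf")
    prev = [inf] * m
    for i in range(n):
        cur = [inf] * m
        for j in range(m):
            d = abs(p[i] - q[j])
            if i == 0 and j == 0:
                cur[j] = d
            else:
                # cheapest way to reach the pair (i, j): advance on p, on q, or on both
                reach = min(
                    prev[j],
                    cur[j - 1] if j > 0 else inf,
                    prev[j - 1] if j > 0 else inf,
                )
                # the coupling cost is the largest vertex distance along the way
                cur[j] = max(d, reach)
        prev = cur
    return prev[m - 1]


print(discrete_frechet([0, 1, 2], [0, 2]))
print(discrete_frechet([0, 5, 0], [0, 0]))
```

Unlike DTW, which sums the matched distances, this measure takes their maximum, so a single far-apart pair of vertices determines the result.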

    Käyttäjien jäljittäminen ja kannusteiden hallinta älykkäissä liikennejärjestelmissä (User Tracking and Incentive Management in Intelligent Transport Systems)

    A system for offering incentives for ecological modes of transport is presented. The main focus is on verifying claims of having taken a trip on such a mode of transport. Three components are presented for the task of travel mode identification: a system to select features, a means to measure the similarity of a GPS (Global Positioning System) trace to a bus route, and finally a machine-learning approach to the actual identification. Feature selection is carried out by sorting the features according to statistical significance and eliminating correlated features. The novel features considered are skewnesses, kurtoses, auto- and cross-correlations, and spectral components of speed and acceleration. Of these, only spectral components are found to be particularly useful in classification. Bus route similarity is measured by using a novel indexing structure called the MBR-tree, short for "Multiple Bounding Rectangle", to find the most similar bus traces. The MBR-tree is an expansion of the R-tree for sequences of bounding rectangles, based on an estimation method for the longest common subsequence that uses such sequences. A second option, decomposing traces into sequences of direction-distance-duration triples and indexing them in an M-tree using edit distance with real penalty, is considered but shown to perform poorly. For machine learning, the methods considered are Bayes classification, random forest, and feedforward neural networks with and without autoencoders. Autoencoder neural networks are shown to perform perplexingly poorly, but the other methods perform close to the state of the art. Methods for obfuscating the user's location and constructing secure electronic coupons are also discussed.
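
The feature selection step described above (rank by significance, then eliminate correlated features) can be sketched as a greedy filter. This is only one plausible reading of the abstract: the significance scores are assumed to be given, and the Pearson correlation, the 0.9 threshold, the function names, and the feature names are all hypothetical choices made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length samples (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(columns, scores, max_abs_corr=0.9):
    """Keep features in descending score order, skipping any that is strongly
    correlated with a feature that has already been kept."""
    ranked = sorted(columns, key=lambda name: scores[name], reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) <= max_abs_corr for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns and significance scores (higher = more significant).
columns = {
    "speed_mean": [1, 2, 3, 4],
    "speed_mean_doubled": [2, 4, 6, 8],  # perfectly correlated with speed_mean
    "accel_spectral": [1, -1, 1, -1],
}
scores = {"speed_mean": 0.9, "speed_mean_doubled": 0.8, "accel_spectral": 0.5}
print(select_features(columns, scores))
```

Processing features in score order means that, of any strongly correlated pair, the more significant one is the one that survives.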

    An attention model and its application in man-made scene interpretation

    The ultimate aim of research into computer vision is to design a system that interprets its surrounding environment the way humans do effortlessly. However, the state of technology is far from achieving such a goal. In this thesis, different components of a computer vision system designed for the task of interpreting man-made scenes, in particular images of buildings, are described. The flow of information in the proposed system is bottom-up, i.e., the image is first segmented into its meaningful components and the regions are subsequently labelled using a contextual classifier. Starting from simple observations concerning the human vision system and the gestalt laws of human perception, such as the law of “good (simple) shape” and “perceptual grouping”, a blob detector is developed that identifies components in a 2D image. These components are convex regions of interest, with interest being defined as significant gradient magnitude content. An eye tracking experiment is conducted, which shows that the regions identified by the blob detector correlate significantly with the regions that draw the attention of viewers. Having identified these blobs, it is postulated that a blob represents an object, linguistically identified with its own semantic name. In other words, a blob may contain a window, a door or a chimney in a building. These regions are used to identify and segment higher-order structures in a building, such as the facade and window arrays, and also environmental regions such as sky and ground. Because of inconsistency in the unary features of buildings, a contextual learning algorithm is used to classify the segmented regions. A model that learns spatial and topological relationships between different objects from a set of hand-labelled data is used. This model utilises this information in an MRF to achieve consistent labellings of new scenes.

    Drawing, Handwriting Processing Analysis: New Advances and Challenges

    Drawing and handwriting are communication skills that have been fundamental to geopolitical, ideological and technological evolution throughout history. Drawing and handwriting are still useful in defining innovative applications in numerous fields. In this regard, researchers have to solve new problems, such as how drawing and handwriting can become an efficient way to command various connected objects, or how to validate graphomotor skills as evident and objective sources of data useful in the study of human beings, their capabilities and their limits from birth to decline.