
1,762 research outputs found

Dynamic Clustering of Histogram Data Based on Adaptive Squared Wasserstein Distances

Author: Antonio Irpino
Rosanna Verde
Francisco de A.T. De Carvalho
Publication venue: Elsevier BV
Publication date: 07/10/2011

This paper deals with clustering methods based on adaptive distances for histogram data using a dynamic clustering algorithm. Histogram data describe individuals in terms of empirical distributions. This kind of data can be considered a complex description of phenomena observed on complex objects: images, groups of individuals, spatially or temporally varying data, results of queries, environmental data, and so on. The Wasserstein distance is used to compare two histograms. The Wasserstein distance between histograms consists of two components: the first is based on the means, and the second on the internal dispersions (standard deviation, skewness, kurtosis, and so on) of the histograms. To cluster sets of histogram data, we propose the Dynamic Clustering Algorithm (based on adaptive squared Wasserstein distances), a k-means-like algorithm for clustering a set of individuals into K classes that are fixed a priori. The main aim of this research is to provide a tool for clustering histograms that emphasizes the different contributions of the histogram variables, and of their components, to the definition of the clusters. We demonstrate that this can be achieved using adaptive distances. Two kinds of adaptive distances are considered: the first takes into account the variability of each component of each descriptor over the whole set of individuals; the second takes into account the variability of each component of each descriptor within each cluster. We provide tools for interpreting the obtained partition, based on an extension of the classical measures (indexes) to the use of adaptive distances in the clustering criterion function. Applications on synthetic and real-world data corroborate the proposed procedure.
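
The two-component view of the squared Wasserstein distance mentioned in the abstract can be illustrated with a minimal sketch. It assumes one-dimensional data given as raw samples rather than binned histograms, approximates the quantile functions on a uniform grid, and does not reproduce the paper's adaptive weights or clustering algorithm. The function name `squared_wasserstein_1d` and the parameter `n_levels` are hypothetical names chosen for this illustration.

```python
import numpy as np


def squared_wasserstein_1d(x, y, n_levels=1000):
    """Approximate the squared L2 Wasserstein distance between two 1-D samples.

    Returns (total, mean_part, dispersion_part), where total equals
    mean_part + dispersion_part up to floating-point error.
    Illustrative sketch only; not the authors' implementation.
    """
    # Evaluate both empirical quantile functions on a common midpoint grid in (0, 1).
    t = (np.arange(n_levels) + 0.5) / n_levels
    qx = np.quantile(x, t)
    qy = np.quantile(y, t)

    # Squared L2 distance between the two quantile functions.
    total = np.mean((qx - qy) ** 2)

    # Component 1: squared difference between the grid-based means.
    mean_part = (qx.mean() - qy.mean()) ** 2

    # Component 2: distance between the centered quantile functions,
    # which captures differences in dispersion and shape.
    dispersion_part = np.mean(((qx - qx.mean()) - (qy - qy.mean())) ** 2)

    return total, mean_part, dispersion_part
```

For two samples that differ only by a shift, the dispersion component is close to zero and the whole distance comes from the means; for samples with equal means but different spreads, the opposite holds.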

arXiv.org e-Print Archive

Crossref

IDENTIFICATION OF COVER SONGS USING INFORMATION THEORETIC MEASURES OF SIMILARITY

Author: Dixon S
Foster P
IEEE
Klapuri A
Publication date: 01/01/2013

13 pages, 5 figures, 4 tables. v3: Accepted version

Queen Mary Research Online


Adapting Metrics for Music Similarity Using Comparative Ratings

Author: Weyde T.
Wolff D.
Publication date: 01/01/2011

City Research Online

3rd Workshop in Symbolic Data Analysis: book of abstracts

Author: Arroyo Gallardo Javier, ed.
Brito Paula, ed.
Maté Carlos, ed.
Noirhomme-Fraiture Monique, ed.
Publication date: 01/01/2012

This workshop is the third regular meeting of researchers interested in Symbolic Data Analysis. The main aim of the event is to bring people together and encourage the exchange of ideas from the different fields (Mathematics, Statistics, Computer Science, Engineering, and Economics, among others) that contribute to Symbolic Data Analysis.

Docta Complutense

Analysis of Trajectories by Preserving Structural Information

Author: Jawad Ahmed
Publication venue: Universitäts- und Landesbibliothek Bonn

The analysis of trajectories from traffic data is an established yet fast-growing area of research in the related fields of Geo-analytics and Geographic Information Systems (GIS). It has a broad range of applications that affect the lives of millions of people, e.g., in urban planning, transportation and navigation systems, and localized search methods. Most of these applications share some underlying basic tasks related to the matching, clustering, and classification of trajectories. These tasks in turn share some underlying problems, i.e., dealing with noisy, variable-length spatio-temporal sequences in the wild. In our view, these problems can be handled better by exploiting the spatio-temporal relationships (or structural information) in sampled trajectory points that remain largely unharmed during the measurement process. Although the use of such structural information has allowed breakthroughs in other fields related to the analysis of complex data sets [18], surprisingly, no existing approach in trajectory analysis looks at this structural information in a unified way across multiple tasks. In this thesis, we build upon these observations and give a unified treatment of structural information in order to improve trajectory analysis tasks. This treatment shows for the first time that sequences, graphs, and kernels are common to machine learning and geo-analytics. This common language makes it possible to pool the corresponding methods and knowledge, and so helps to address the challenges raised by the ever-growing amount of movement data by developing new analysis models and methods. This is illustrated in several ways. For example, we introduce new problem settings, distance functions, and a visualization scheme in the area of trajectory analysis. We also connect the broad field of kernel methods to the analysis of trajectories, and we revisit and strengthen the link between biological sequence methods and the analysis of trajectories. Finally, the results of our experiments show that, by incorporating the structural information, our methods improve over the state of the art in the focused tasks, i.e., map matching, clustering, and traffic event detection.

bonndoc – Der Publikationsserver der Universität Bonn