48,024 research outputs found

Dissimilarity-based Ensembles for Multiple Instance Learning

Author: Cheplygina Veronika
Loog Marco
Tax David M. J.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

In multiple instance learning, objects are sets (bags) of feature vectors (instances) rather than individual feature vectors. In this paper we address the problem of how these bags can best be represented. Two standard approaches are to use (dis)similarities between bags and prototype bags, or between bags and prototype instances. The first approach results in a relatively low-dimensional representation determined by the number of training bags, while the second approach results in a relatively high-dimensional representation, determined by the total number of instances in the training set. In this paper a third, intermediate approach is proposed, which links the two approaches and combines their strengths. Our classifier is inspired by a random subspace ensemble, and considers subspaces of the dissimilarity space, defined by subsets of instances, as prototypes. We provide guidelines for using such an ensemble, and show state-of-the-art performances on a range of multiple instance learning problems.

Comment: Submitted to IEEE Transactions on Neural Networks and Learning Systems, Special Issue on Learning in Non-(geo)metric Spaces
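As an illustration of the first approach described in this abstract, the Python sketch below represents each bag by its dissimilarities to a set of prototype bags. The bag dissimilarity used here (the mean, over instances of one bag, of the Euclidean distance to the nearest instance of the other bag) and the function names are assumptions chosen for illustration; the abstract does not specify them, and this is not the authors' ensemble method.

```python
import numpy as np

def bag_dissimilarity(bag_a, bag_b):
    """Assumed measure: mean over instances of bag_a of the Euclidean
    distance to the nearest instance of bag_b (not symmetric in general)."""
    diffs = bag_a[:, None, :] - bag_b[None, :, :]   # shape (n_a, n_b, dim)
    dists = np.sqrt((diffs ** 2).sum(axis=2))       # pairwise instance distances
    return dists.min(axis=1).mean()

def dissimilarity_representation(bags, prototype_bags):
    """One row per bag, one column per prototype bag."""
    return np.array([[bag_dissimilarity(b, p) for p in prototype_bags]
                     for b in bags])

# Toy data: two bags of 2-D instances, used as their own prototypes.
bags = [np.array([[0.0, 0.0], [1.0, 0.0]]), np.array([[3.0, 4.0]])]
rep = dissimilarity_representation(bags, bags)
print(rep.shape)
```

With the training bags as prototypes, the representation has as many columns as there are training bags, which matches the low-dimensional case the abstract describes.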

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Dealing with non-metric dissimilarities in fuzzy central clustering algorithms

Author: Maurizio Filippone
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

Clustering is the problem of grouping objects on the basis of a similarity measure among them. Relational clustering methods can be employed when a feature-based representation of the objects is not available, and their description is given in terms of pairwise (dis)similarities. This paper focuses on the relational duals of fuzzy central clustering algorithms, and their application in situations when patterns are represented by means of non-metric pairwise dissimilarities. Symmetrization and shift operations have been proposed to transform the dissimilarities among patterns from non-metric to metric. In this paper, we analyze how four popular fuzzy central clustering algorithms are affected by such transformations. The main findings are that these algorithms are not invariant to shift operations but are invariant to symmetrization. Moreover, we highlight the connections between relational duals of central clustering algorithms and central clustering algorithms in kernel-induced spaces. One of the presented algorithms has never been proposed for non-metric relational clustering, and turns out to be very robust to shift operations. (C) 2008 Elsevier Inc. All rights reserved.
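The two transformations named in this abstract can be sketched in Python as follows. The specific forms are assumptions for illustration (symmetrization as the average of the matrix and its transpose, shift as adding a constant to every off-diagonal entry); the abstract does not give the paper's exact definitions, and the function names are hypothetical.

```python
import numpy as np

def symmetrize(d):
    """Assumed form: average the dissimilarity matrix with its transpose."""
    return 0.5 * (d + d.T)

def shift(d, alpha):
    """Assumed form: add alpha to every off-diagonal entry, keeping
    self-dissimilarities on the diagonal unchanged."""
    off_diagonal = 1.0 - np.eye(d.shape[0])
    return d + alpha * off_diagonal

# Toy non-symmetric dissimilarity matrix for two patterns.
d = np.array([[0.0, 1.0],
              [3.0, 0.0]])
d_sym = symmetrize(d)
d_shifted = shift(d_sym, 1.0)
print(d_sym)
print(d_shifted)
```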

CiteSeerX

Elsevier - Publisher Connector

Crossref

Enlighten

White Rose Research Online

On aggregation operators of transitive similarity and dissimilarity relations

Author: Belanche Muñoz Luis Antonio
Orozco Luquero Jorge
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

Similarity and dissimilarity are widely used concepts. One of the most studied matters is their combination or aggregation. However, the transitivity property is often ignored when aggregating, despite being a highly important property that many authors have studied from different points of view. We collect here some results on preserving transitivity when aggregating, with the aim of clarifying the relationship between aggregation and transitivity and of supporting the design of aggregation operators that preserve transitivity. Some examples of the utility of the results are also shown.

Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Ranking and significance of variable-length similarity-based time series motifs

Author: Arcos Josep Lluis
Corral Álvaro
Serra Isabel
Serrà Joan
Publication venue: 'Elsevier BV'
Publication date: 06/03/2015

The detection of very similar patterns in a time series, commonly called motifs, has received continuous and increasing attention from diverse scientific communities. In particular, recent approaches for discovering similar motifs of different lengths have been proposed. In this work, we show that such variable-length similarity-based motifs cannot be directly compared, and hence ranked, by their normalized dissimilarities. Specifically, we find that length-normalized motif dissimilarities still have intrinsic dependencies on the motif length, and that the lowest dissimilarities are particularly affected by this dependency. Moreover, we find that such dependencies are generally non-linear and change with the considered data set and dissimilarity measure. Based on these findings, we propose a solution to rank those motifs and measure their significance. This solution relies on a compact but accurate model of the dissimilarity space, using a beta distribution with three parameters that depend on the motif length in a non-linear way. We believe the incomparability of variable-length dissimilarities could go beyond the field of time series, and that modeling strategies similar to the one used here could be of help in a broader context.

Comment: 20 pages, 10 figures

arXiv.org e-Print Archive

Digital.CSIC

Further results on dissimilarity spaces for hyperspectral images RF-CBIR

Author: Datcu Mihai
Graña Manuel
Veganzones Miguel Angel
Publication venue: 'Elsevier BV'
Publication date: 04/07/2013

Content-Based Image Retrieval (CBIR) systems are powerful search tools in image databases that have been little applied to hyperspectral images. Relevance feedback (RF) is an iterative process that uses machine learning techniques and user feedback to improve the performance of CBIR systems. We sought to extend previous research on hyperspectral CBIR systems built on dissimilarity functions defined either on spectral and spatial features extracted by spectral unmixing techniques, or on dictionaries extracted by dictionary-based compressors. These dissimilarity functions were not suitable for direct application in common machine learning techniques. We propose a general RF approach based on dissimilarity spaces, which is more appropriate for applying machine learning algorithms to hyperspectral RF-CBIR. We validate the proposed RF method for hyperspectral CBIR systems on a real hyperspectral dataset.

Comment: In Pattern Recognition Letters (2013)

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Hal - Université Grenoble Alpes

Evidential relational clustering using medoids

Author: Liu Zhun-Ga
Martin Arnaud
Pan Quan
Zhou Kuang
Publication venue
Publication date: 06/07/2015

In real clustering applications, proximity data, in which only pairwise similarities or dissimilarities are known, is more general than object data, in which each pattern is described explicitly by a list of attributes. Medoid-based clustering algorithms, which assume the prototypes of classes are objects, are of great value for partitioning relational data sets. In this paper we propose a new prototype-based clustering method, named Evidential C-Medoids (ECMdd), which extends Fuzzy C-Medoids (FCMdd) within the theoretical framework of belief functions. In ECMdd, medoids are used as the prototypes that represent the detected classes, including specific classes and imprecise classes. Specific classes are for the data which are distinctly far from the prototypes of other classes, while imprecise classes accept the objects that may be close to the prototypes of more than one class. This soft decision mechanism can make the clustering results more cautious and reduce the misclassification rates. Experiments on synthetic and real data sets are used to illustrate the performance of ECMdd. The results show that ECMdd captures the uncertainty in the internal data structure well. Moreover, it is more robust to initialization than FCMdd.

Comment: In the 18th International Conference on Information Fusion, July 2015, Washington, DC, United States

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1

Ward's Hierarchical Clustering Method: Clustering Criterion and Agglomerative Algorithm

Author: Fionn Murtagh
Pierre Legendre
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/12/2011

The Ward error sum of squares hierarchical clustering method has been very widely used since its first description by Ward in a 1963 publication. It has also been generalized in various ways. However, there are different interpretations in the literature and there are different implementations of the Ward agglomerative algorithm in commonly used software systems, including differing expressions of the agglomerative criterion. Our survey work and case studies will be useful for all those involved in developing software for data analysis using Ward's hierarchical clustering method.

Comment: 20 pages, 21 citations, 4 figures
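For reference, one commonly used expression of the Ward agglomerative criterion is the increase in the error sum of squares caused by merging clusters A and B, which equals |A||B| / (|A| + |B|) times the squared Euclidean distance between their centroids. The Python sketch below checks that identity on toy data. It shows only this one expression, with illustrative function names, and is not a survey of the variants the abstract refers to.

```python
import numpy as np

def error_sum_of_squares(points):
    """Sum of squared Euclidean distances of the points to their centroid."""
    return ((points - points.mean(axis=0)) ** 2).sum()

def ward_merge_cost(a, b):
    """Increase in error sum of squares when clusters a and b are merged,
    written in terms of cluster sizes and centroids."""
    n_a, n_b = len(a), len(b)
    gap = a.mean(axis=0) - b.mean(axis=0)
    return n_a * n_b / (n_a + n_b) * (gap ** 2).sum()

# Toy clusters in 2-D.
a = np.array([[0.0, 0.0], [2.0, 0.0]])
b = np.array([[5.0, 0.0]])

# Direct computation of the increase, for comparison with the closed form.
increase = (error_sum_of_squares(np.vstack([a, b]))
            - error_sum_of_squares(a)
            - error_sum_of_squares(b))
print(np.isclose(increase, ward_merge_cost(a, b)))
```

Implementations can differ in whether they work with this quantity, its square root, or squared versus unsquared input distances, so results from different software should be compared with that in mind.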

arXiv.org e-Print Archive

Goldsmiths Research Online

Crossref

De Montfort University Open Research Archive