4 research outputs found

Efficient Data Analytics on Augmented Similarity Triplets

Authors: Ahmad Muhammad; Ali Sarwan; Karim Asim; Khan Imdadullah; Shakeel Muhammad Haroon; Zaman Arif
Publication date: 27/12/2019

Many machine learning methods (classification, clustering, etc.) start with a known kernel that provides a similarity or distance measure between two objects. Recent work has extended this to situations where the information about objects is limited to comparisons of distances between three objects (triplets). Humans find the comparison task much easier than the estimation of absolute similarities, so this kind of data can be easily obtained using crowd-sourcing. In this work, we give an efficient method of augmenting the triplet data by utilizing additional implicit information inferred from the existing data. Triplet augmentation improves the quality of kernel-based and kernel-free data analytics tasks. We also propose a novel set of algorithms for common supervised and unsupervised machine learning tasks based on triplets. These methods work directly with triplets, avoiding kernel evaluations. Experimental evaluation on real and synthetic datasets shows that our methods are more accurate than the current best-known techniques.
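One plausible way to read "implicit information inferred from the existing data" is transitivity: if object a is closer to b than to c, and closer to c than to d, then a is also closer to b than to d. The sketch below illustrates that idea only. The tuple encoding `(a, b, c)` for "a is closer to b than to c" and the function name `augment_triplets` are assumptions made for illustration, not the authors' published algorithm.

```python
from collections import defaultdict


def augment_triplets(triplets):
    """Return the input triplets plus those implied by transitivity.

    A triplet (a, b, c) is read as "a is closer to b than to c".
    For each anchor a, the triplets form a directed graph b -> c;
    every node reachable from b yields an implied triplet (a, b, node).
    """
    # farther[a][b] = objects known to be farther from a than b is
    farther = defaultdict(lambda: defaultdict(set))
    for a, b, c in triplets:
        farther[a][b].add(c)

    augmented = set()
    for a, graph in farther.items():
        for start in list(graph):
            # Depth-first search over everything reachable from `start`.
            stack = list(graph[start])
            seen = set()
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                stack.extend(graph.get(node, ()))
            # Skip c == start, which can only arise from contradictory (noisy) input.
            augmented.update((a, start, c) for c in seen if c != start)
    return augmented


example = augment_triplets([(0, 1, 2), (0, 2, 3)])
```

Here `example` contains the two original triplets together with the implied triplet `(0, 1, 3)`. Noisy crowd-sourced answers can create cycles, which this sketch tolerates but does not resolve.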

arXiv.org e-Print Archive

A Revenue Function for Comparison-Based Hierarchical Clustering

Authors: Ghoshdastidar Debarghya; Mandal Aishik; Perrot Michaël
Publication date: 02/04/2023

Comparison-based learning addresses the problem of learning when, instead of explicit features or pairwise similarities, one only has access to comparisons of the form: "Object A is more similar to B than to C." Recently, it has been shown that, in hierarchical clustering, single and complete linkage can be directly implemented using only such comparisons, while several algorithms have been proposed to emulate the behaviour of average linkage. Hence, finding hierarchies (or dendrograms) using only comparisons is a well-understood problem. However, evaluating their meaningfulness when neither ground truth nor explicit similarities are available remains an open question. In this paper, we bridge this gap by proposing a new revenue function that allows one to measure the goodness of dendrograms using only comparisons. We show that this function is closely related to Dasgupta's cost for hierarchical clustering that uses pairwise similarities. On the theoretical side, we use the proposed revenue function to resolve the open problem of whether one can approximately recover a latent hierarchy using few triplet comparisons. On the practical side, we present principled algorithms for comparison-based hierarchical clustering based on the maximisation of the revenue and we empirically compare them with existing methods.

Comment: 26 pages, 6 figures, 5 tables. Transactions on Machine Learning Research (2023)
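For orientation, Dasgupta's cost, which the abstract names as the similarity-based counterpart of the proposed revenue, charges each pair of objects its similarity times the number of leaves under the pair's lowest common ancestor; lower is better. The sketch below computes that cost only, not the paper's revenue function, whose definition is not given in this listing. The nested-tuple tree encoding and the names `leaves` and `dasgupta_cost` are assumptions made for illustration.

```python
def leaves(tree):
    """List the leaf labels of a dendrogram encoded as nested tuples."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def dasgupta_cost(tree, sim):
    """Dasgupta's cost: sum over pairs of sim(i, j) * |leaves(lca(i, j))|.

    `sim` maps frozenset({i, j}) to a non-negative similarity;
    missing pairs count as similarity 0.
    """
    if not isinstance(tree, tuple):
        return 0.0
    size = len(leaves(tree))
    child_leaves = [leaves(child) for child in tree]
    cost = 0.0
    # Pairs separated at this node have this node as their lowest common ancestor.
    for a in range(len(child_leaves)):
        for b in range(a + 1, len(child_leaves)):
            for i in child_leaves[a]:
                for j in child_leaves[b]:
                    cost += sim.get(frozenset((i, j)), 0.0) * size
    return cost + sum(dasgupta_cost(child, sim) for child in tree)


sim = {frozenset((0, 1)): 1.0, frozenset((0, 2)): 0.1, frozenset((1, 2)): 0.1}
good = dasgupta_cost(((0, 1), 2), sim)  # merges the most similar pair first
bad = dasgupta_cost(((0, 2), 1), sim)   # merges a dissimilar pair first
```

In this toy example `good` is smaller than `bad`, matching the intuition that similar objects should be merged low in the tree.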

arXiv.org e-Print Archive

Boosting for Comparison-Based Learning

Authors: Luxburg Ulrike von; Perrot Michael
Publication venue: 'SAGE Publications'
Publication date: 01/01/2019

We consider the problem of classification in a comparison-based setting: given a set of objects, we only have access to triplet comparisons of the form "object x_i is closer to object x_j than to object x_k." In this paper we introduce TripletBoost, a new method that can learn a classifier just from such triplet comparisons. The main idea is to aggregate the triplet information into weak classifiers, which can subsequently be boosted to a strong classifier. Our method has two main advantages: (i) it is applicable to data from any metric space, and (ii) it can deal with large-scale problems using only passively obtained and noisy triplets. We derive theoretical generalization guarantees and a lower bound on the number of necessary triplets, and we empirically show that our method is both competitive with state-of-the-art approaches and resistant to noise.

Comment: This is the extended version (38 pages) of a paper accepted to the International Joint Conference on Artificial Intelligence (IJCAI) 201
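To make "aggregate the triplets information into weak classifiers" concrete, the sketch below builds a single weak classifier from one reference pair (x_j, x_k): training objects are split by which reference they are closer to, and each side predicts its majority label. This is a simplified illustration under stated assumptions, not the TripletBoost algorithm itself; the boosting step is omitted, and the names `fit_triplet_stump` and `predict_triplet_stump` and the data layout are hypothetical.

```python
from collections import Counter


def fit_triplet_stump(j, k, triplets, labels):
    """Fit a weak classifier for the reference pair (j, k).

    `triplets` is a set of (i, j, k) tuples meaning "i is closer to j than to k".
    `labels` maps each training object to its class label.
    Returns the majority label on each side (None if a side is empty).
    """
    near_j, near_k = Counter(), Counter()
    for i, y in labels.items():
        if (i, j, k) in triplets:
            near_j[y] += 1
        elif (i, k, j) in triplets:
            near_k[y] += 1
        # Objects with no comparison against this pair are ignored.

    def majority(counts):
        return counts.most_common(1)[0][0] if counts else None

    return {"near_j": majority(near_j), "near_k": majority(near_k)}


def predict_triplet_stump(stump, i, j, k, triplets):
    """Predict a label for object i, or None (abstain) if no triplet is available."""
    if (i, j, k) in triplets:
        return stump["near_j"]
    if (i, k, j) in triplets:
        return stump["near_k"]
    return None


triplets = {(1, 10, 20), (2, 10, 20), (3, 20, 10), (4, 10, 20)}
labels = {1: "a", 2: "a", 3: "b"}
stump = fit_triplet_stump(10, 20, triplets, labels)
prediction = predict_triplet_stump(stump, 4, 10, 20, triplets)
```

Abstaining when no triplet is available is one way to cope with passively collected comparisons, where most (i, j, k) combinations are never observed.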

arXiv.org e-Print Archive

Crossref

Publikationsserver der Universität Tübingen

MPG.PuRe