143 research outputs found

Clustering and Inconsistent Information: A Kernelization Approach

Author: Cao Yixin

Clustering is the unsupervised classification of patterns into groups, which is easy provided the pattern data are consistent. However, real data are almost always tainted by inconsistencies, which makes clustering a hard problem; indeed, the most widely studied formulations, correlation clustering and hierarchical clustering, are both NP-hard. In the graph representation of data, inconsistencies also frequently present themselves as cycles, also called deadlocks, and breaking cycles by removing vertices is the objective of the classical feedback vertex set (FVS) problem. This dissertation studies three problems, correlation clustering, hierarchical clustering, and disjoint-FVS (a variation of FVS), from a kernelization approach. A kernelization algorithm runs in polynomial time and provably reduces a problem instance, speeding up further processing with other approaches. For each of the problems studied, an efficient kernelization algorithm with linear or sub-quadratic running time is presented. All the kernels obtained in this dissertation have linear size with very small constants. Better parameterized algorithms based on the kernels are also designed for the last two problems. Finally, some concluding remarks on possible directions for future research are briefly mentioned.

Texas A&M Repository

A Survey on Metric Learning for Feature Vectors and Structured Data

Author: Bellet Aurélien
Habrard Amaury
Sebban Marc
Publication date: 01/01/2013

The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting such good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest in machine learning and related fields for the past ten years. This survey paper proposes a systematic review of the metric learning literature, highlighting the pros and cons of each approach. We pay particular attention to Mahalanobis distance metric learning, a well-studied and successful framework, but additionally present a wide range of methods that have recently emerged as powerful alternatives, including nonlinear metric learning, similarity learning and local metric learning. Recent trends and extensions, such as semi-supervised metric learning, metric learning for histogram data and the derivation of generalization guarantees, are also covered. Finally, this survey addresses metric learning for structured data, in particular edit distance learning, and attempts to give an overview of the remaining challenges in metric learning for the years to come.

Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new method

arXiv.org e-Print Archive

HAL-UJM

Multivariate Analysis of Clustering Problems with Constraints

Author: Purohit Nidhi
Publication venue: The University of Bergen
Publication date: 15/11/2023

Doctoral thesis

University of Bergen

Cluster Editing with Vertex Splitting

Author: A Fadiel
F Dehne
F Hüffner
FN Abu-Khzam
G-H Lin
HL Bodlaender
J Chen
J Flum
J Gramm
J Guo
L Cai
M Cygan
M D’Addario
M Fellows
MR Fellows
N Tomašev
R Niedermeier
R Shamir
RG Downey
S Böcker
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

Crossref

OPUS - University of Technology Sydney

A survey of parameterized algorithms and the complexity of edge modification

Author: Crespelle Christophe
Drange Pål Grønås
Fomin Fedor
Golovach Petr
Publication venue: Elsevier
Publication date: 01/01/2023

The survey is a comprehensive overview of the developing area of parameterized algorithms for graph modification problems. It describes the state of the art in kernelization, subexponential algorithms, and parameterized complexity of graph modification. The main focus is on edge modification problems, where the task is to change some adjacencies in a graph to satisfy some required properties. To facilitate further research, we list many open problems in the area.

University of Bergen

Parameterized Approximation Algorithms in Network Design and Clustering

Author: Feldmann Andreas Emil
Publication date: 05/11/2021

CU Digital Repository