Low-Rank Binary Matrix Approximation in Column-Sum Norm

Author: Fomin Fedor V.
Golovach Petr A.
Panolan Fahad
Simonov Kirill
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)
Publication date: 12/04/2019

We consider $\ell_1$-Rank-$r$ Approximation over GF(2), where for a binary $m\times n$ matrix ${\bf A}$ and a positive integer $r$, one seeks a binary matrix ${\bf B}$ of rank at most $r$, minimizing the column-sum norm $\|{\bf A}-{\bf B}\|_1$. We show that for every $\varepsilon\in (0, 1)$, there is a randomized $(1+\varepsilon)$-approximation algorithm for $\ell_1$-Rank-$r$ Approximation over GF(2) of running time $m^{O(1)}n^{O(2^{4r}\cdot \varepsilon^{-4})}$. This is the first polynomial time approximation scheme (PTAS) for this problem.
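To make the objective in this abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It only evaluates the two quantities the problem statement names: the column-sum norm of A - B over GF(2) (the largest number of mismatches in any single column) and the GF(2) rank of a candidate B. The function names `column_sum_norm_gf2` and `rank_gf2` and the example matrices are illustrative assumptions.

```python
def column_sum_norm_gf2(A, B):
    """Column-sum norm of A - B over GF(2): the maximum, over columns,
    of the number of rows in which A and B disagree."""
    m, n = len(A), len(A[0])
    return max(sum(A[i][j] ^ B[i][j] for i in range(m)) for j in range(n))


def rank_gf2(M):
    """Rank over GF(2) via Gaussian elimination on rows packed into integers."""
    rows = [int("".join(map(str, row)), 2) for row in M]
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            # Clear that bit from every remaining row.
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank


# Hypothetical 3x3 instance: A has full rank over GF(2); B has rank 2
# (its third row is the XOR of the first two) and differs from A in one entry.
A = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 1]]
B = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0]]

error = column_sum_norm_gf2(A, B)
```

Here B is a feasible solution for r = 2, and `error` is the cost the problem asks to minimize over all such B.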

arXiv.org e-Print Archive

University of Bergen

Dagstuhl Research Online Publication Server

Research Archive of Indian Institute of Technology Hyderabad

NORA - Norwegian Open Research Archives

A Survey on Metric Learning for Feature Vectors and Structured Data

Author: Bellet Aurélien
Habrard Amaury
Sebban Marc
Publication date: 01/01/2013

The need for appropriate ways to measure the distance or similarity between data is ubiquitous in machine learning, pattern recognition and data mining, but handcrafting such good metrics for specific problems is generally difficult. This has led to the emergence of metric learning, which aims at automatically learning a metric from data and has attracted a lot of interest in machine learning and related fields for the past ten years. This survey paper proposes a systematic review of the metric learning literature, highlighting the pros and cons of each approach. We pay particular attention to Mahalanobis distance metric learning, a well-studied and successful framework, but additionally present a wide range of methods that have recently emerged as powerful alternatives, including nonlinear metric learning, similarity learning and local metric learning. Recent trends and extensions, such as semi-supervised metric learning, metric learning for histogram data and the derivation of generalization guarantees, are also covered. Finally, this survey addresses metric learning for structured data, in particular edit distance learning, and attempts to give an overview of the remaining challenges in metric learning for the years to come.

Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new method.

arXiv.org e-Print Archive

HAL-UJM

Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms

Author: Munteanu Alexander
Schwiegelshohn Chris
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

We present a technical survey on the state-of-the-art approaches in data reduction and the coreset framework. These include geometric decompositions, gradient methods, random sampling, sketching and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview of lower bounding techniques.

Archivio della ricerca- Università di Roma La Sapienza

Approximating the Center Ranking Under Ulam

Author: Chakraborty Diptarka
Gajjar Kshitij
Jha Agastya Vibhuti
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 41st IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 2021)
Publication date: 01/01/2021

Dagstuhl Research Online Publication Server

On Finding the Jaccard Center

Author: Bury Marc
Schwiegelshohn Chris
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017)
Publication date: 01/01/2017

We initiate the study of finding the Jaccard center of a given collection N of sets. For two sets X, Y, the Jaccard index is defined as $|X\cap Y|/|X\cup Y|$ and the corresponding distance is $1-|X\cap Y|/|X\cup Y|$. The Jaccard center is a set C minimizing the maximum distance to any set of N. We show that the problem is NP-hard to solve exactly, and that it admits a PTAS while no FPTAS can exist unless P = NP. Furthermore, we show that the problem is fixed parameter tractable in the maximum Hamming norm between Jaccard center and any input set. Our algorithms are based on a compression technique similar in spirit to coresets for the Euclidean 1-center problem. In addition, we also show that, contrary to the previously studied median problem by Chierichetti et al. (SODA 2010), the continuous version of the Jaccard center problem admits a simple polynomial time algorithm.
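The definitions in this abstract can be illustrated with a short Python sketch. It is an exhaustive search over candidate sets, not the paper's PTAS or its compression technique, and it takes exponential time in the size of the universe, so it is only usable on tiny inputs. The names `jaccard_distance` and `brute_force_jaccard_center`, the convention that two empty sets are at distance 0, and the sample collection are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard_distance(x, y):
    """Jaccard distance: 1 - |X ∩ Y| / |X ∪ Y|."""
    x, y = set(x), set(y)
    union = x | y
    if not union:
        return 0.0  # assumed convention: two empty sets are identical
    return 1 - len(x & y) / len(union)


def brute_force_jaccard_center(collection):
    """Try every subset of the union of the input sets and keep the one whose
    maximum distance to an input set is smallest. Elements outside the union
    can only enlarge |C ∪ X|, so they never help."""
    universe = sorted(set().union(*collection))
    best, best_cost = None, float("inf")
    for k in range(len(universe) + 1):
        for candidate in combinations(universe, k):
            c = set(candidate)
            cost = max(jaccard_distance(c, s) for s in collection)
            if cost < best_cost:
                best, best_cost = c, cost
    return best, best_cost


# Hypothetical input collection N (must be non-empty).
N = [{1, 2}, {2, 3}, {1, 3}]
center, radius = brute_force_jaccard_center(N)
```

For this collection the search returns the full set {1, 2, 3}, which is at distance 1/3 from each input set, while every input set taken as the center is at distance 2/3 from the other two.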

Dagstuhl Research Online Publication Server

Archivio della ricerca- Università di Roma La Sapienza