Computing Bi-Lipschitz Outlier Embeddings into the Line

Authors: Chubarian Karine, Sidiropoulos Anastasios
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2020)
Publication date: 01/01/2020

The problem of computing a bi-Lipschitz embedding of a graphical metric into the line with minimum distortion has received a lot of attention. The best-known approximation algorithm computes an embedding with distortion $O(c^2)$, where $c$ denotes the optimal distortion [Bădoiu et al. 2005]. We present a bi-criteria approximation algorithm that extends the above results to the setting of outliers. Specifically, we say that a metric space $(X,\rho)$ admits a $(k,c)$-embedding if there exists $K\subset X$, with $|K|=k$, such that $(X\setminus K, \rho)$ admits an embedding into the line with distortion at most $c$. Given $k\geq 0$, and a metric space that admits a $(k,c)$-embedding, for some $c\geq 1$, our algorithm computes a $(\mathsf{poly}(k, c, \log n), \mathsf{poly}(c))$-embedding in polynomial time. This is the first algorithmic result for outlier bi-Lipschitz embeddings. Prior to our work, comparable outlier embeddings were known only for the case of additive distortion.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Composition of nested embeddings with an application to outlier removal

Author: Chawla Shuchi
Sheridan Kristin
Publication date: 27/06/2023

We study the design of embeddings into Euclidean space with outliers. Given a metric space $(X,d)$ and an integer $k$, the goal is to embed all but $k$ points in $X$ (called the "outliers") into $\ell_2$ with the smallest possible distortion $c$. Finding the optimal distortion $c$ for a given outlier set size $k$, or alternately the smallest $k$ for a given target distortion $c$, are both NP-hard problems. In fact, it is UGC-hard to approximate $k$ to within a factor smaller than $2$ even when the metric sans outliers is isometrically embeddable into $\ell_2$. We consider bi-criteria approximations. Our main result is a polynomial time algorithm that approximates the outlier set size to within an $O(\log^4 k)$ factor and the distortion to within a constant factor. The main technical component in our result is an approach for constructing a composition of two given embeddings from subsets of $X$ into $\ell_2$ which inherits the distortions of each to within small multiplicative factors. Specifically, given a low $c_S$ distortion embedding from $S\subset X$ into $\ell_2$ and a high(er) $c_X$ distortion embedding from the entire set $X$ into $\ell_2$, we construct a single embedding that achieves the same distortion $c_S$ over pairs of points in $S$ and an expansion of at most $O(\log k)\cdot c_X$ over the remaining pairs of points, where $k=|X\setminus S|$. Our composition theorem extends to embeddings into arbitrary $\ell_p$ metrics for $p\ge 1$, and may be of independent interest. While unions of embeddings over disjoint sets have been studied previously, to our knowledge, this is the first work to consider compositions of nested embeddings.

Comment: 25 pages (including 2 appendices), 5 figures
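The shape of the composition guarantee (one bound over pairs inside $S$, another over the remaining pairs) can be illustrated with a small checker. This is only a sketch for measuring a given embedding, not the construction from the paper. The function name `split_report`, the toy metric, and the map `g` are hypothetical. Expansion is measured here as the raw ratio of embedded to original distance without any rescaling, and `g` is assumed to be injective on `S`.

```python
import math
from itertools import combinations

def split_report(points, d, g, S):
    """Return (distortion over pairs inside S, max expansion over all other pairs)
    for an embedding g (dict: point -> coordinate tuple) into Euclidean space."""
    s_expansion = s_contraction = rest_expansion = 0.0
    for x, y in combinations(points, 2):
        original = d(x, y)
        embedded = math.dist(g[x], g[y])  # Euclidean distance (Python 3.8+)
        if x in S and y in S:
            if embedded == 0:
                raise ValueError("g collapses two points of S")
            s_expansion = max(s_expansion, embedded / original)
            s_contraction = max(s_contraction, original / embedded)
        else:
            rest_expansion = max(rest_expansion, embedded / original)
    return s_expansion * s_contraction, rest_expansion

# Hypothetical toy metric: a path a - b - c plus a point z at distance 1 from all.
D = {
    frozenset(("a", "b")): 1, frozenset(("b", "c")): 1, frozenset(("a", "c")): 2,
    frozenset(("a", "z")): 1, frozenset(("b", "z")): 1, frozenset(("c", "z")): 1,
}
d = lambda x, y: D[frozenset((x, y))]
X = ["a", "b", "c", "z"]
S = {"a", "b", "c"}                      # here k = |X \ S| = 1
g = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0), "z": (1.0, 1.0)}

s_distortion, rest_expansion = split_report(X, d, g, S)
print(s_distortion, rest_expansion)
```

Here the pairs inside `S` are embedded isometrically, while the pairs involving the outlier `z` are stretched by a factor of at most $\sqrt{2}$.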

arXiv.org e-Print Archive