Max-sum diversity via convex programming
Diversity maximization is an important concept in information retrieval,
computational geometry and operations research. Usually, it is a variant of the
following problem: given a ground set $X$, constraints, and a function $f$
that measures the diversity of a subset, the task is to select a feasible subset
$S \subseteq X$ such that $f(S)$ is maximized. The \emph{sum-dispersion} function
$f(S) = \sum_{u,v \in S} d(u,v)$, which is the sum of the pairwise distances in $S$, is
in this context a prominent diversification measure. The corresponding
diversity maximization problem is known as \emph{max-sum} or \emph{sum-sum diversification}.
Many recent results deal with the design of constant-factor approximation
algorithms for diversification problems involving the sum-dispersion function under
a matroid constraint. In this paper, we present a PTAS for the max-sum
diversification problem under a matroid constraint for distances
of \emph{negative type}. Distances of negative type include, for
example, metric distances stemming from the $\ell_1$ and $\ell_2$ norms, as well
as the cosine, spherical, and Jaccard distances, which are popular similarity
measures in web and image search.
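The sum-dispersion measure above is simple to state directly. A minimal sketch (function names and the Euclidean metric are illustrative choices, not the paper's notation):

```python
from itertools import combinations
import math

def euclidean(p, q):
    # Euclidean distance; any distance of negative type could be used here.
    return math.dist(p, q)

def sum_dispersion(S, dist=euclidean):
    """Sum-dispersion f(S): the sum of pairwise distances d(u, v)
    over all unordered pairs {u, v} of the subset S."""
    return sum(dist(u, v) for u, v in combinations(S, 2))

# Three corners of a unit right triangle: distances 1, 1, and sqrt(2).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(sum_dispersion(points))  # 2 + sqrt(2)
```

Maximizing this quantity over feasible subsets (e.g. the independent sets of a matroid) is the optimization problem the abstract refers to; the sketch only evaluates the objective.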
Sublinear quasiconformality and the large-scale geometry of Heintze groups
This article analyzes sublinearly quasisymmetric homeomorphisms (generalized
quasisymmetric mappings), and draws applications to the sublinear large-scale
geometry of negatively curved groups and spaces. It is proven that those
homeomorphisms lack analytical properties but preserve a conformal dimension
and appropriate function spaces, distinguishing certain (nonsymmetric)
Riemannian negatively curved homogeneous spaces, and Fuchsian buildings, up to
sublinearly biLipschitz equivalence (generalized quasiisometry).
Comment: v1->v2: shortened, revised. Lemma 2.3 and definition of Cdim
corrected. Proof of main theorem simplified. Figure 4 added.
Contributions on metric spaces with applications in personalized medicine
This thesis aims to propose new distributional representations and
statistical methods in metric spaces to effectively model data arising
from the continuous monitoring of patients during their daily-life
activities. We propose new hypothesis tests for paired data, regression
models, uncertainty quantification algorithms, tests of statistical
independence, and clustering algorithms for the new distributional
representations and other complex statistical objects. The results
collected throughout the thesis show the advantages of the new proposals
over existing methods in terms of prediction, interpretability, and
modeling capacity.
Distance Measures for Embedded Graphs
We introduce new distance measures for comparing straight-line embedded
graphs based on the Fr\'echet distance and the weak Fr\'echet distance. These
graph distances are defined using continuous mappings and thus take the
combinatorial structure as well as the geometric embeddings of the graphs into
account. We present a general algorithmic approach for computing these graph
distances. Although we show that deciding the distances is NP-hard for general
embedded graphs, we prove that our approach yields polynomial time algorithms
if the graphs are trees, and for the distance based on the weak Fr\'echet
distance if the graphs are planar embedded. Moreover, we prove that deciding
the distances based on the Fr\'echet distance remains NP-hard for planar
embedded graphs and show how our general algorithmic approach yields an
exponential time algorithm and a polynomial time approximation algorithm for
this case.
Comment: 27 pages, 14 figures
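The graph distances above build on the Fr\'echet distance between curves. While the continuous version is involved, its discrete variant for polyline vertices is a short dynamic program (this is the standard Eiter-Mannila recurrence, shown for intuition, not the paper's graph algorithm):

```python
import math
from functools import lru_cache

def discrete_frechet(P, Q):
    """Discrete Frechet distance between polylines P and Q, given as
    lists of points. Standard Eiter-Mannila dynamic program: the coupling
    cost c(i, j) is the cheapest max-distance walk reaching (P[i], Q[j])."""
    @lru_cache(maxsize=None)
    def c(i, j):
        d = math.dist(P[i], Q[j])
        if i == 0 and j == 0:
            return d
        if i == 0:
            return max(c(0, j - 1), d)
        if j == 0:
            return max(c(i - 1, 0), d)
        # Advance on P, on Q, or on both; keep the cheapest predecessor.
        return max(min(c(i - 1, j), c(i - 1, j - 1), c(i, j - 1)), d)
    return c(len(P) - 1, len(Q) - 1)

# Two parallel horizontal segments at vertical offset 1.
print(discrete_frechet([(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)]))  # 1.0
```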
Geometry-Aware Adaptation for Pretrained Models
Machine learning models -- including prominent zero-shot models -- are often
trained on datasets whose labels are only a small proportion of a larger label
space. Such spaces are commonly equipped with a metric that relates the labels
via distances between them. We propose a simple approach to exploit this
information to adapt the trained model to reliably predict new classes -- or,
in the case of zero-shot prediction, to improve its performance -- without any
additional training. Our technique is a drop-in replacement of the standard
prediction rule, swapping argmax with the Fr\'echet mean. We provide a
comprehensive theoretical analysis for this approach, studying (i)
learning-theoretic results trading off label space diameter, sample complexity,
and model dimension, (ii) characterizations of the full range of scenarios in
which it is possible to predict any unobserved class, and (iii) an optimal
active learning-like next class selection procedure to obtain optimal training
classes for when it is not possible to predict the entire range of unobserved
classes. Empirically, using easily-available external metrics, our proposed
approach, Loki, gains up to 29.7% relative improvement over SimCLR on ImageNet
and scales to hundreds of thousands of classes. When no such metric is
available, Loki can use self-derived metrics from class embeddings and obtains
a 10.5% improvement on pretrained zero-shot models such as CLIP.
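The drop-in swap of argmax for the Fr\'echet mean can be sketched in a few lines (the function name and distance-matrix interface are illustrative assumptions, not Loki's actual API): instead of taking the highest-scoring observed class, predict the candidate label minimizing the probability-weighted sum of squared label distances.

```python
import numpy as np

def frechet_mean_predict(probs, D):
    """Frechet-mean prediction rule.
    probs: shape (k,), model probabilities over the k observed classes.
    D: shape (m, k), metric distances from each of m candidate labels
    (possibly unobserved) to each observed class.
    Returns the index of the candidate minimizing sum_c probs[c] * D[y, c]^2."""
    cost = (D ** 2) @ probs
    return int(np.argmin(cost))

# Three candidate labels, two observed classes with equal probability.
# The middle candidate sits between both observed classes and wins.
D = np.array([[0.0, 2.0],
              [1.0, 1.0],
              [2.0, 0.0]])
probs = np.array([0.5, 0.5])
print(frechet_mean_predict(probs, D))  # 1
```

Note how this can return a label the model was never trained on, which is the mechanism by which new classes become predictable without additional training.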
F?D: On understanding the role of deep feature spaces on face generation evaluation
Perceptual metrics, like the Fr\'echet Inception Distance (FID), are widely
used to assess the similarity between synthetically generated and ground truth
(real) images. The key idea behind these metrics is to compute errors in a deep
feature space that captures perceptually and semantically rich image features.
Despite their popularity, the effect that different deep features and their
design choices have on a perceptual metric has not been well studied. In this
work, we perform a causal analysis linking differences in semantic attributes
and distortions between face image distributions to Fr\'echet distances (FD)
using several popular deep feature spaces. A key component of our analysis is
the creation of synthetic counterfactual faces using deep face generators. Our
experiments show that the FD is heavily influenced by its feature space's
training dataset and objective function. For example, FD using features
extracted from ImageNet-trained models heavily emphasize hats over regions like
the eyes and mouth. Moreover, FD using features from a face gender classifier
emphasize hair length more than distances in an identity (recognition) feature
space. Finally, we evaluate several popular face generation models across
feature spaces and find that StyleGAN2 consistently ranks higher than other
face generators, except with respect to identity (recognition) features. This
suggests the need for considering multiple feature spaces when evaluating
generative models and using feature spaces that are tuned to nuances of the
domain of interest.
Comment: Code and dataset to be released soon.
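The Fr\'echet distance between two Gaussians fitted to deep features, which underlies FID and the FDs studied above, has the closed form $\|\mu_1 - \mu_2\|^2 + \mathrm{Tr}(C_1 + C_2 - 2(C_1 C_2)^{1/2})$. A minimal NumPy sketch of this formula (illustrative, not the authors' evaluation code):

```python
import numpy as np

def frechet_distance(mu1, C1, mu2, C2):
    """Frechet distance between Gaussians N(mu1, C1) and N(mu2, C2):
    ||mu1 - mu2||^2 + Tr(C1 + C2 - 2 (C1 C2)^{1/2})."""
    # Symmetric square root of C1 via eigendecomposition.
    w, V = np.linalg.eigh(C1)
    s1 = V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T
    # Tr((C1 C2)^{1/2}) = Tr((s1 C2 s1)^{1/2}); s1 C2 s1 is symmetric PSD,
    # so its square-root trace is the sum of the roots of its eigenvalues.
    w2 = np.linalg.eigvalsh(s1 @ C2 @ s1)
    tr_sqrt = np.sum(np.sqrt(np.clip(w2, 0, None)))
    diff = np.asarray(mu1) - np.asarray(mu2)
    return float(diff @ diff + np.trace(C1) + np.trace(C2) - 2 * tr_sqrt)

# Unit-covariance Gaussians whose means differ by 1: FD = 1.
print(frechet_distance([0, 0], np.eye(2), [1, 0], np.eye(2)))  # 1.0
```

The paper's point is that the feature space producing the means and covariances, not this formula, is what drives the metric's sensitivities.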
Approximating Sparsest Cut in Low Rank Graphs via Embeddings from Approximately Low Dimensional Spaces
We consider the problem of embedding a finite set of points $x_1, \ldots, x_n \in \mathbb{R}^d$
that satisfy $\ell_2^2$ triangle inequalities into $\ell_1$, when the points are
approximately low-dimensional. Goemans (unpublished, appears in a work of Magen
and Moharrami (2008)) showed that such points residing in exactly $d$ dimensions
can be embedded into $\ell_1$ with distortion at most $\sqrt{d}$. We prove the
following robust analogue of this statement: if there exists an $r$-dimensional
subspace $\Pi$ such that the projections onto this subspace satisfy
$\sum_{i,j \in [n]} \|\Pi x_i - \Pi x_j\|_2^2 \geq \Omega(1) \cdot \sum_{i,j \in [n]} \|x_i - x_j\|_2^2$,
then there is an embedding of the points into $\ell_1$ with $O(\sqrt{r})$ average
distortion. A consequence of this result is that the integrality gap of the
well-known Goemans-Linial SDP relaxation for the Uniform Sparsest Cut problem is
$O(\sqrt{r})$ on graphs $G$ whose $r$-th smallest normalized eigenvalue of the
Laplacian satisfies $\lambda_r(G)/n \geq \Omega(1) \cdot \Phi_{SDP}(G)$. Our result
improves upon the previously known bound of $O(r)$ on the average distortion, and
on the integrality gap of the Goemans-Linial SDP under the same preconditions,
proven in [Deshpande and Venkat, 2014] and [Deshpande, Harsha and Venkat, 2016].
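The $\ell_2^2$ triangle inequality condition on the point set can be checked directly: squared Euclidean distances do not satisfy it in general, so it is a genuine restriction. A brute-force sketch (the function name is mine, for illustration):

```python
from itertools import permutations
import numpy as np

def satisfies_l22_triangle(X, tol=1e-9):
    """Check whether the squared Euclidean distances of the rows of X
    satisfy d2(i, k) <= d2(i, j) + d2(j, k) for every ordered triple,
    i.e. the l_2^2 triangle inequalities used in the Goemans-Linial SDP."""
    X = np.asarray(X, dtype=float)
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    for i, j, k in permutations(range(len(X)), 3):
        if D2[i, k] > D2[i, j] + D2[j, k] + tol:
            return False
    return True

# An equilateral triangle satisfies the condition (all squared distances equal),
# but unevenly spaced collinear points violate it: 9 > 1 + 4.
print(satisfies_l22_triangle([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]]))  # True
print(satisfies_l22_triangle([[0.0], [1.0], [3.0]]))                            # False
```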