research

Max-sum diversity via convex programming

Abstract

Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: Given a ground set, constraints, and a function f(⋅)f(\cdot) that measures diversity of a subset, the task is to select a feasible subset SS such that f(S)f(S) is maximized. The \emph{sum-dispersion} function f(S)=∑x,y∈Sd(x,y)f(S) = \sum_{x,y \in S} d(x,y), which is the sum of the pairwise distances in SS, is in this context a prominent diversification measure. The corresponding diversity maximization is the \emph{max-sum} or \emph{sum-sum diversification}. Many recent results deal with the design of constant-factor approximation algorithms of diversification problems involving sum-dispersion function under a matroid constraint. In this paper, we present a PTAS for the max-sum diversification problem under a matroid constraint for distances d(⋅,⋅)d(\cdot,\cdot) of \emph{negative type}. Distances of negative type are, for example, metric distances stemming from the ℓ2\ell_2 and ℓ1\ell_1 norm, as well as the cosine or spherical, or Jaccard distance which are popular similarity metrics in web and image search

    Similar works