6,131 research outputs found

On the Complexity of $t$-Closeness Anonymization and Related Problems

Author: D. Rebollo-Monedero
E. Anshelevich
J. Blocki
J. Cao
L. Sweeney
N. Li
P. Bonizzoni
P. Samarati
P.A. Evans
R. Bredereck
Y. Rubner
Publication date: 01/01/2013

An important issue in releasing individual data is to protect the sensitive information from being leaked and maliciously utilized. Famous privacy preserving principles that aim to ensure both data privacy and data integrity, such as $k$-anonymity and $l$-diversity, have been extensively studied both theoretically and empirically. Nonetheless, these widely-adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge about the overall sensitive data distribution. The $t$-closeness principle has been proposed to fix this, which also has the benefit of supporting numerical sensitive attributes. However, in contrast to $k$-anonymity and $l$-diversity, the theoretical aspect of $t$-closeness has not been well investigated. We initiate the first systematic theoretical study on the $t$-closeness principle under the commonly-used attribute suppression model. We prove that for every constant $t$ such that $0\leq t<1$, it is NP-hard to find an optimal $t$-closeness generalization of a given table. The proof consists of several reductions each of which works for different values of $t$, which together cover the full range. To complement this negative result, we also provide exact and fixed-parameter algorithms. Finally, we answer some open questions regarding the complexity of $k$-anonymity and $l$-diversity left in the literature.

Comment: An extended abstract to appear in DASFAA 201
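To make the $t$-closeness condition concrete, the following is a minimal sketch of a checker, not the algorithm from the paper. It assumes a categorical sensitive attribute with equal ground distance, in which case the Earth Mover's Distance commonly used for $t$-closeness reduces to the variational distance. The function names, the row layout, and the sample table are all hypothetical.

```python
from collections import Counter


def distribution(values):
    """Empirical distribution of a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def satisfies_t_closeness(rows, quasi_identifiers, sensitive, t):
    """True if every equivalence class is within distance t of the whole table.

    Assumption: rows are dicts, and suppressed cells already hold a
    placeholder such as "*", so rows with equal quasi-identifier values
    form one equivalence class.
    """
    overall = distribution([r[sensitive] for r in rows])
    groups = {}
    for r in rows:
        key = tuple(r[a] for a in quasi_identifiers)
        groups.setdefault(key, []).append(r[sensitive])
    return all(
        variational_distance(distribution(vals), overall) <= t
        for vals in groups.values()
    )


# Hypothetical table: the "zip" column is partly suppressed.
table = [
    {"zip": "130*", "disease": "flu"},
    {"zip": "130*", "disease": "flu"},
    {"zip": "148*", "disease": "cancer"},
    {"zip": "148*", "disease": "cancer"},
]
```

In this sample each equivalence class is at distance 0.5 from the overall distribution, so the table passes for $t = 0.5$ and fails for any smaller $t$. The check itself is easy; the hardness result concerns finding the cheapest suppression that makes it pass.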

arXiv.org e-Print Archive

Crossref

Feature-Based Diversity Optimization for Problem Instance Classification

Author: Gao Wanru
Nallaperuma Samadhi
Neumann Frank
Publication date: 29/05/2020

Understanding the behaviour of heuristic search methods is a challenge. This even holds for simple local search methods such as 2-OPT for the Traveling Salesperson problem. In this paper, we present a general framework that is able to construct a diverse set of instances that are hard or easy for a given search heuristic. Such a diverse set is obtained by using an evolutionary algorithm for constructing hard or easy instances that are diverse with respect to different features of the underlying problem. Examining the constructed instance sets, we show that many combinations of two or three features give a good classification of the TSP instances in terms of whether they are hard to solve with 2-OPT.

Comment: 20 pages, 18 figures
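For readers unfamiliar with 2-OPT, here is a minimal sketch of the local search the abstract refers to, assuming Euclidean points in the plane. It is a textbook version written for illustration, not the authors' implementation; the function names and the sample instance are hypothetical.

```python
import math


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    n = len(tour)
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n)
    )


def two_opt(points, tour):
    """Repeatedly replace two edges by two shorter ones until no move helps."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city, nothing to swap
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                delta = (
                    math.dist(a, c) + math.dist(b, d)
                    - math.dist(a, b) - math.dist(c, d)
                )
                if delta < -1e-12:
                    # Reversing the segment swaps edges (a,b),(c,d) for (a,c),(b,d).
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour


# Hypothetical instance: the corners of a unit square, visited in a crossing order.
square = [(0, 0), (1, 1), (1, 0), (0, 1)]
start = [0, 1, 2, 3]
```

The result is only a local optimum. How many improvement steps the search needs, and how far the final tour is from optimal, depends on the instance, which is the kind of hardness the paper's generated instances are meant to expose.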

arXiv.org e-Print Archive

Adelaide Research & Scholarship

Clustering with diversity

Author: Li Jian
Yi Ke
Zhang Qin
Publication date: 01/01/2010

We consider the {\em clustering with diversity} problem: given a set of colored points in a metric space, partition them into clusters such that each cluster has at least $\ell$ points, all of which have distinct colors. We give a 2-approximation to this problem for any $\ell$ when the objective is to minimize the maximum radius of any cluster. We show that the approximation ratio is optimal unless $\mathbf{P=NP}$, by providing a matching lower bound. Several extensions to our algorithm have also been developed for handling outliers. This problem is mainly motivated by applications in privacy-preserving data publication.

Comment: Extended abstract accepted in ICALP 2010. Keywords: Approximation algorithm, k-center, k-anonymity, l-diversity
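The constraint and objective described above can be sketched as a small validator. This is illustrative only and is not the 2-approximation from the paper. It assumes Euclidean points and that a cluster's center is one of its own members; all names and the sample data are hypothetical.

```python
import math


def is_valid_cluster(colors, cluster, ell):
    """A cluster needs at least ell points, all with distinct colors."""
    cluster_colors = [colors[i] for i in cluster]
    return len(cluster) >= ell and len(set(cluster_colors)) == len(cluster_colors)


def cluster_radius(points, cluster):
    """Smallest radius around a member point that covers the whole cluster."""
    return min(
        max(math.dist(points[center], points[p]) for p in cluster)
        for center in cluster
    )


def max_cluster_radius(points, clusters):
    """Objective value: the largest radius over all clusters."""
    return max(cluster_radius(points, cl) for cl in clusters)


# Hypothetical instance: four points on a line, two colors.
pts = [(0, 0), (2, 0), (10, 0), (12, 0)]
cols = ["red", "blue", "red", "blue"]
good = [[0, 1], [2, 3]]   # each cluster mixes colors
bad = [[0, 2], [1, 3]]    # each cluster repeats a color
```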

arXiv.org e-Print Archive

CiteSeerX

Hong Kong University of Science and Technology Institutional Repository

Multiwinner Voting with Fairness Constraints

Author: Celis L. Elisa
Huang Lingxiao
Vishnoi Nisheeth K.
Publication date: 18/06/2018

Multiwinner voting rules are used to select a small representative subset of candidates or items from a larger set given the preferences of voters. However, if candidates have sensitive attributes such as gender or ethnicity (when selecting a committee), or specified types such as political leaning (when selecting a subset of news items), an algorithm that chooses a subset by optimizing a multiwinner voting rule may be unbalanced in its selection -- it may under- or over-represent a particular gender or political orientation in the examples above. We introduce an algorithmic framework for multiwinner voting problems when there is an additional requirement that the selected subset should be "fair" with respect to a given set of attributes. Our framework provides the flexibility to (1) specify fairness with respect to multiple, non-disjoint attributes (e.g., ethnicity and gender) and (2) specify a score function. We study the computational complexity of this constrained multiwinner voting problem for monotone and submodular score functions and present several approximation algorithms and matching hardness of approximation results for various attribute group structures and types of score functions. We also present simulations that suggest that adding fairness constraints may not affect the scores significantly when compared to the unconstrained case.

Comment: The conference version of this paper appears in IJCAI-ECAI 201

arXiv.org e-Print Archive

Crossref

Submodular Optimization with Submodular Cover and Submodular Knapsack Constraints

Author: Bilmes Jeff
Iyer Rishabh
Publication date: 08/11/2013

We investigate two new optimization problems -- minimizing a submodular function subject to a submodular lower bound constraint (submodular cover) and maximizing a submodular function subject to a submodular upper bound constraint (submodular knapsack). We are motivated by a number of real-world applications in machine learning including sensor placement and data subset selection, which require maximizing a certain submodular function (like coverage or diversity) while simultaneously minimizing another (like cooperative cost). These problems are often posed as minimizing the difference between submodular functions [14, 35] which is in the worst case inapproximable. We show, however, that by phrasing these problems as constrained optimization, which is more natural for many applications, we achieve a number of bounded approximation guarantees. We also show that both these problems are closely related and an approximation algorithm solving one can be used to obtain an approximation guarantee for the other. We provide hardness results for both problems thus showing that our approximation factors are tight up to log-factors. Finally, we empirically demonstrate the performance and good scalability properties of our algorithms.

Comment: 23 pages. A short version of this appeared in Advances of NIPS-201

arXiv.org e-Print Archive

CiteSeerX

Multiwinner Elections with Diversity Constraints

Author: Bredereck Robert
Faliszewski Piotr
Igarashi Ayumi
Lackner Martin
Skowron Piotr
Publication date: 21/11/2017

We develop a model of multiwinner elections that combines performance-based measures of the quality of the committee (such as, e.g., Borda scores of the committee members) with diversity constraints. Specifically, we assume that the candidates have certain attributes (such as being a male or a female, being junior or senior, etc.) and the goal is to elect a committee that, on the one hand, has as high a score regarding a given performance measure, but that, on the other hand, meets certain requirements (e.g., of the form "at least $30\%$ of the committee members are junior candidates and at least $40\%$ are females"). We analyze the computational complexity of computing winning committees in this model, obtaining polynomial-time algorithms (exact and approximate) and NP-hardness results. We focus on several natural classes of voting rules and diversity constraints.

Comment: A short version of this paper appears in the proceedings of AAAI-1
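The model can be illustrated with a brute-force search over committees. This is a sketch under simplifying assumptions, not one of the paper's algorithms: the committee score is taken to be the sum of individual candidate scores, constraints are lower bounds on attribute counts, and the running time is exponential in the committee size. Candidate names, scores, and attributes are hypothetical.

```python
from itertools import combinations


def best_committee(scores, attributes, k, lower_bounds):
    """Highest-scoring size-k committee meeting every attribute lower bound.

    scores:       candidate -> individual score (assumed additive)
    attributes:   candidate -> set of attribute labels
    lower_bounds: attribute label -> minimum number of members required
    Returns (committee, score), or (None, -inf) if no committee is feasible.
    """
    best, best_score = None, float("-inf")
    for committee in combinations(sorted(scores), k):
        feasible = all(
            sum(1 for c in committee if attr in attributes[c]) >= need
            for attr, need in lower_bounds.items()
        )
        if not feasible:
            continue
        total = sum(scores[c] for c in committee)
        if total > best_score:
            best, best_score = committee, total
    return best, best_score


# Hypothetical election with four candidates.
scores = {"a": 10, "b": 9, "c": 8, "d": 3}
attributes = {
    "a": {"senior", "male"},
    "b": {"senior", "male"},
    "c": {"senior", "female"},
    "d": {"junior", "female"},
}
```

With no constraints the two top scorers win. Requiring at least one junior member forces candidate "d" onto the committee and lowers the achievable score, which is the trade-off the model formalizes.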

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Max-sum diversity via convex programming

Author: Cevallos Alfonso
Eisenbrand Friedrich
Zenklusen Rico
Publication date: 22/11/2015

Diversity maximization is an important concept in information retrieval, computational geometry and operations research. Usually, it is a variant of the following problem: Given a ground set, constraints, and a function $f(\cdot)$ that measures diversity of a subset, the task is to select a feasible subset $S$ such that $f(S)$ is maximized. The \emph{sum-dispersion} function $f(S) = \sum_{x,y \in S} d(x,y)$, which is the sum of the pairwise distances in $S$, is in this context a prominent diversification measure. The corresponding diversity maximization is the \emph{max-sum} or \emph{sum-sum diversification}. Many recent results deal with the design of constant-factor approximation algorithms of diversification problems involving sum-dispersion function under a matroid constraint. In this paper, we present a PTAS for the max-sum diversification problem under a matroid constraint for distances $d(\cdot,\cdot)$ of \emph{negative type}. Distances of negative type are, for example, metric distances stemming from the $\ell_2$ and $\ell_1$ norm, as well as the cosine or spherical, or Jaccard distance which are popular similarity metrics in web and image search.
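As a small illustration of the sum-dispersion objective, the sketch below evaluates $f(S)$ over unordered pairs and maximizes it by exhaustive search. It is not the PTAS from the paper: it assumes the simplest matroid constraint, a cardinality bound $|S| = k$, uses the Euclidean ($\ell_2$) distance, and runs in exponential time. The function names and sample points are hypothetical.

```python
import math
from itertools import combinations


def sum_dispersion(points, subset, dist=math.dist):
    """f(S): sum of pairwise distances, each unordered pair counted once."""
    return sum(dist(points[i], points[j]) for i, j in combinations(subset, 2))


def max_sum_diversity_bruteforce(points, k, dist=math.dist):
    """Index set of size k maximizing f(S), found by trying every subset."""
    return max(
        combinations(range(len(points)), k),
        key=lambda s: sum_dispersion(points, s, dist),
    )


# Hypothetical ground set: four points on a line.
pts = [(0, 0), (1, 0), (2, 0), (10, 0)]
```

Passing a different `dist` callable, for instance an $\ell_1$ distance, changes the metric without touching the search.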

arXiv.org e-Print Archive

Repository for Publications and Research Data

Dagstuhl Research Online Publication Server