Quantifying Privacy: A Novel Entropy-Based Measure of Disclosure Risk
It is well recognised that data mining and statistical analysis pose a
serious threat to privacy. This is true for financial, medical, criminal and
marketing research. Numerous techniques have been proposed to protect privacy,
including restriction and data modification. Recently proposed privacy models
such as differential privacy and k-anonymity have received a lot of attention, and
for the latter there are now several improvements of the original scheme, each
removing some security shortcomings of the previous one. However, the challenge
lies in evaluating and comparing privacy provided by various techniques. In
this paper we propose a novel entropy-based security measure that can be
applied to any generalisation, restriction or data modification technique. We
use our measure to empirically evaluate and compare a few popular methods,
namely query restriction, sampling and noise addition.
Comment: 20 pages, 4 figures
Algorithms for the continuous nonlinear resource allocation problem: new implementations and numerical studies
Patriksson (2008) provided a then up-to-date survey on the
continuous, separable, differentiable and convex resource allocation problem
with a single resource constraint. Since the publication of that paper the
interest in the problem has grown: several new applications have arisen where
the problem at hand constitutes a subproblem, and several new algorithms have
been developed for its efficient solution. This paper therefore serves three
purposes. First, it provides an up-to-date extension of the survey of the
literature of the field, complementing the survey in Patriksson (2008) with
more than 20 books and articles. Second, it contributes improvements of some of
these algorithms, in particular with an improvement of the pegging (that is,
variable fixing) process in the relaxation algorithm, and an improved means to
evaluate subsolutions. Third, it numerically evaluates several relaxation
(primal) and breakpoint (dual) algorithms, incorporating a variety of pegging
strategies, as well as a quasi-Newton method. Our conclusion is that our
modification of the relaxation algorithm performs the best. At least for
problem sizes up to 30 million variables the practical time complexity for the
breakpoint and relaxation algorithms is linear.
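As a rough illustration of the problem class described in this abstract, the sketch below solves a small separable convex instance with a single resource constraint and box bounds. It uses plain bisection on the Lagrange multiplier, which is a simpler stand-in and not the breakpoint or relaxation algorithms the paper evaluates. The function name `allocate` and the quadratic cost form are assumptions made only for this example.

```python
def allocate(a, c, lower, upper, budget, tol=1e-12, max_iter=200):
    """Hypothetical helper: minimise sum(0.5*a[j]*x[j]**2 - c[j]*x[j])
    subject to sum(x) == budget and lower[j] <= x[j] <= upper[j].
    Assumes every a[j] > 0 (strict convexity)."""
    if not sum(lower) <= budget <= sum(upper):
        raise ValueError("budget is outside the range allowed by the bounds")

    def x_of(lam):
        # Stationarity a*x - c + lam = 0, projected onto the bounds.
        return [min(max((cj - lam) / aj, lj), uj)
                for aj, cj, lj, uj in zip(a, c, lower, upper)]

    # Bracket the multiplier: at lam_lo every variable sits at its upper
    # bound, at lam_hi every variable sits at its lower bound.
    lam_lo = min(cj - aj * uj for aj, cj, uj in zip(a, c, upper))
    lam_hi = max(cj - aj * lj for aj, cj, lj in zip(a, c, lower))

    # sum(x_of(lam)) is nonincreasing in lam, so bisection applies.
    for _ in range(max_iter):
        if lam_hi - lam_lo <= tol:
            break
        mid = 0.5 * (lam_lo + lam_hi)
        if sum(x_of(mid)) > budget:
            lam_lo = mid
        else:
            lam_hi = mid
    return x_of(0.5 * (lam_lo + lam_hi))


if __name__ == "__main__":
    # Two variables sharing a budget of 2; the first is capped at 1.
    print(allocate([1.0, 2.0], [4.0, 4.0], [0.0, 0.0], [1.0, 5.0], 2.0))
```

Each bisection step costs one pass over the variables, so this sketch does not achieve the linear practical complexity reported for the breakpoint and relaxation algorithms; it only shows the structure of the single-constraint problem.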
Minimum and maximum entropy distributions for binary systems with known means and pairwise correlations
Maximum entropy models are increasingly being used to describe the collective
activity of neural populations with measured mean neural activities and
pairwise correlations, but the full space of probability distributions
consistent with these constraints has not been explored. We provide upper and
lower bounds on the entropy for the minimum entropy distribution over
arbitrarily large collections of binary units with any fixed set of mean values
and pairwise correlations. We also construct specific low-entropy distributions
for several relevant cases. Surprisingly, the minimum entropy solution has
entropy scaling logarithmically with system size for any set of first- and
second-order statistics consistent with arbitrarily large systems. We further
demonstrate that some sets of these low-order statistics can only be realized
by small systems. Our results show how only small amounts of randomness are
needed to mimic low-order statistical properties of highly entropic
distributions, and we discuss some applications for engineered and biological
information transmission systems.
Comment: 34 pages, 7 figures
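A toy case, not taken from the paper, may help make the last claim concrete. For three binary units, the uniform distribution over all eight states and the uniform distribution over the four even-parity states share the same means and pairwise second moments, yet the second has one bit less entropy. The helper names `entropy_bits` and `low_order_stats` are assumptions of this sketch, and the example says nothing about the logarithmic scaling result itself.

```python
from itertools import product
from math import log2


def entropy_bits(dist):
    """Shannon entropy in bits of a {state: probability} mapping."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def low_order_stats(dist, n):
    """Return the means E[x_i] and pairwise second moments E[x_i x_j]."""
    means = [sum(p * s[i] for s, p in dist.items()) for i in range(n)]
    pairs = {(i, j): sum(p * s[i] * s[j] for s, p in dist.items())
             for i in range(n) for j in range(i + 1, n)}
    return means, pairs


n = 3
states = list(product((0, 1), repeat=n))

# Fully independent fair bits: uniform over all 2**n states.
independent = {s: 1 / len(states) for s in states}

# Uniform over even-parity states only: the third bit is the XOR of the
# first two, so the units are pairwise independent but not jointly so.
even = [s for s in states if sum(s) % 2 == 0]
parity = {s: 1 / len(even) for s in even}

if __name__ == "__main__":
    print(low_order_stats(independent, n))
    print(low_order_stats(parity, n))
    print(entropy_bits(independent), entropy_bits(parity))
```

Both distributions have means of 0.5 and pairwise second moments of 0.25, while their entropies are 3 bits and 2 bits respectively.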