13,015 research outputs found

    Quantifying Privacy: A Novel Entropy-Based Measure of Disclosure Risk

    Full text link
    It is well recognised that data mining and statistical analysis pose a serious treat to privacy. This is true for financial, medical, criminal and marketing research. Numerous techniques have been proposed to protect privacy, including restriction and data modification. Recently proposed privacy models such as differential privacy and k-anonymity received a lot of attention and for the latter there are now several improvements of the original scheme, each removing some security shortcomings of the previous one. However, the challenge lies in evaluating and comparing privacy provided by various techniques. In this paper we propose a novel entropy based security measure that can be applied to any generalisation, restriction or data modification technique. We use our measure to empirically evaluate and compare a few popular methods, namely query restriction, sampling and noise addition.Comment: 20 pages, 4 figure

    Algorithms for the continuous nonlinear resource allocation problem---new implementations and numerical studies

    Full text link
    Patriksson (2008) provided a then up-to-date survey on the continuous,separable, differentiable and convex resource allocation problem with a single resource constraint. Since the publication of that paper the interest in the problem has grown: several new applications have arisen where the problem at hand constitutes a subproblem, and several new algorithms have been developed for its efficient solution. This paper therefore serves three purposes. First, it provides an up-to-date extension of the survey of the literature of the field, complementing the survey in Patriksson (2008) with more then 20 books and articles. Second, it contributes improvements of some of these algorithms, in particular with an improvement of the pegging (that is, variable fixing) process in the relaxation algorithm, and an improved means to evaluate subsolutions. Third, it numerically evaluates several relaxation (primal) and breakpoint (dual) algorithms, incorporating a variety of pegging strategies, as well as a quasi-Newton method. Our conclusion is that our modification of the relaxation algorithm performs the best. At least for problem sizes up to 30 million variables the practical time complexity for the breakpoint and relaxation algorithms is linear

    Minimum and maximum entropy distributions for binary systems with known means and pairwise correlations

    Full text link
    Maximum entropy models are increasingly being used to describe the collective activity of neural populations with measured mean neural activities and pairwise correlations, but the full space of probability distributions consistent with these constraints has not been explored. We provide upper and lower bounds on the entropy for the {\em minimum} entropy distribution over arbitrarily large collections of binary units with any fixed set of mean values and pairwise correlations. We also construct specific low-entropy distributions for several relevant cases. Surprisingly, the minimum entropy solution has entropy scaling logarithmically with system size for any set of first- and second-order statistics consistent with arbitrarily large systems. We further demonstrate that some sets of these low-order statistics can only be realized by small systems. Our results show how only small amounts of randomness are needed to mimic low-order statistical properties of highly entropic distributions, and we discuss some applications for engineered and biological information transmission systems.Comment: 34 pages, 7 figure
    corecore