
    Efficient, noise-tolerant, and private learning via boosting

    We introduce a simple framework for designing private boosting algorithms. We give natural conditions under which these algorithms are differentially private, efficient, and noise-tolerant PAC learners. To demonstrate our framework, we use it to construct noise-tolerant and private PAC learners for large-margin halfspaces whose sample complexity does not depend on the dimension. We give two sample complexity bounds for our large-margin halfspace learner. One bound is based only on differential privacy, and uses this guarantee as an asset for ensuring generalization. This first bound illustrates a general methodology for obtaining PAC learners from privacy, which may be of independent interest. The second bound uses standard techniques from the theory of large-margin classification (the fat-shattering dimension) to match the best known sample complexity for differentially private learning of large-margin halfspaces, while additionally tolerating random label noise.
    https://arxiv.org/pdf/2002.01100.pdf
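
    The paper's construction is specific, but the overall shape of boosting a weak halfspace learner while adding noise can be sketched. Below is a minimal, illustrative sketch, not the paper's algorithm: the weak learner is a weighted perceptron pass, and the Gaussian noise added to each weak hypothesis merely stands in for a differential-privacy mechanism (calibrating it to a formal privacy guarantee, and to label-noise tolerance, is exactly what the framework addresses and is not reproduced here). The names weak_learner, boost, and the parameter sigma are hypothetical.

```python
# Minimal sketch of boosting a noisy weak halfspace learner.
# NOT the paper's construction; the noise below is illustrative only.
import numpy as np

def weak_learner(X, y, w, sigma=0.1, rng=None):
    """One weighted perceptron pass, then Gaussian noise on the hypothesis."""
    rng = rng or np.random.default_rng(0)
    theta = np.zeros(X.shape[1])
    for i in rng.permutation(len(y)):
        if y[i] * (X[i] @ theta) <= 0:        # misclassified: update
            theta += w[i] * y[i] * X[i]
    theta += rng.normal(0.0, sigma, size=theta.shape)  # stand-in privacy noise
    return theta / (np.linalg.norm(theta) + 1e-12)

def boost(X, y, rounds=20, eta=0.5):
    """Multiplicative-weights boosting; returns the list of weak hypotheses."""
    n = len(y)
    w = np.ones(n) / n
    hyps = []
    for _ in range(rounds):
        theta = weak_learner(X, y, w)
        hyps.append(theta)
        margins = y * (X @ theta)
        w *= np.exp(-eta * margins)            # upweight small-margin examples
        w /= w.sum()
    return hyps

def predict(hyps, X):
    """Majority vote of the weak hypotheses (ties map to 0)."""
    votes = sum(np.sign(X @ h) for h in hyps)
    return np.sign(votes)
```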

    On the Complexity of Simulating Auxiliary Input

    We construct a simulator for the auxiliary-input simulation problem whose complexity is better than that of all previous results, and we prove its optimality up to logarithmic factors by establishing a black-box lower bound. Specifically, let $\ell$ be the length of the auxiliary input and $\epsilon$ be the indistinguishability parameter. Our simulator has complexity $\tilde{O}(2^{\ell}\epsilon^{-2})$ relative to the distinguisher family. For the lower bound, we show that the complexity of any simulator relative to the distinguishers is at least $\Omega(2^{\ell}\epsilon^{-2})$, assuming the simulator uses the distinguishers in a black-box way and satisfies a mild restriction.
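
    To make the matching bounds concrete, here is an illustrative restatement; the notation $T_{\mathrm{Sim}}$ and $T_D$ for the complexities of the simulator and the distinguisher family is assumed, not taken verbatim from the paper.

```latex
% Illustrative restatement (notation assumed): T_Sim and T_D denote the
% complexity of the simulator and of the distinguisher family.
\[
  T_{\mathrm{Sim}} \;\le\; \tilde{O}\!\left(2^{\ell}\epsilon^{-2}\right)\cdot T_D
  \qquad\text{and, for black-box simulators,}\qquad
  T_{\mathrm{Sim}} \;\ge\; \Omega\!\left(2^{\ell}\epsilon^{-2}\right)\cdot T_D .
\]
```

    For instance, with $\ell = 10$ bits of auxiliary input and $\epsilon = 2^{-5}$, both bounds put the relative overhead on the order of $2^{10} \cdot 2^{10} = 2^{20}$ distinguisher evaluations, up to logarithmic factors.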

    Interactive video retrieval using implicit user feedback.

    In recent years, the rapid development of digital technologies and the low cost of recording media have led to a great increase in the availability of multimedia content worldwide. This availability creates demand for advanced search engines. Traditionally, manual annotation of video was a common practice for supporting retrieval, but the vast amounts of multimedia content make such practices very expensive in terms of human effort. At the same time, the availability of low-cost wearable sensors delivers a plethora of user-machine interaction data. An important challenge is therefore to exploit implicit user feedback (such as navigation patterns and eye movements) gathered during interactive multimedia retrieval sessions in order to improve video search engines. In this thesis, we focus on automatically annotating video content by exploiting the aggregated implicit feedback of past users, expressed as click-through data and gaze movements. Towards this goal, we conducted interactive video retrieval experiments to collect click-through and eye movement data in environments that were not strictly controlled. First, we propose a graph representation of aggregated past interaction data to derive semantic relations between multimedia items, and we exploit these relations to generate recommendations as well as to improve content-based search. Then, we investigate the role of user gaze movements in interactive video retrieval and propose a methodology for inferring user interest by employing support vector machines with gaze-movement-based features. Finally, we propose an automatic video annotation framework that combines query clustering into topics, by constructing gaze-movement-driven random forests and temporally enhanced dominant sets, with video shot classification for predicting the relevance of viewed items with respect to a topic. The results show that exploiting heterogeneous implicit feedback from past users adds value for future users of interactive video retrieval systems.
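
    As an illustration of the gaze-based interest inference step, the sketch below trains a support vector machine on a few aggregate gaze statistics. The feature set (fixation time, fixation count, mean saccade length) and the toy data are hypothetical stand-ins, not the thesis's actual feature vectors; scikit-learn is assumed as the SVM implementation.

```python
# Hypothetical sketch: predict shot relevance from aggregate gaze features.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Each row: [total fixation time (s), fixation count, mean saccade length (px)]
X = np.array([
    [2.4, 11, 35.0],   # long, concentrated viewing
    [0.3,  2, 180.0],  # quick glance, large saccades
    [1.9,  9, 42.0],
    [0.5,  3, 150.0],
])
y = np.array([1, 0, 1, 0])  # 1 = user judged the shot relevant

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X, y)
print(model.predict([[2.1, 10, 40.0]]))  # resembles the long-viewing rows -> [1]
```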

    Data Privacy Beyond Differential Privacy

    Computing technologies today have made it much easier to gather personal data, ranging from GPS locations to medical records, from online behavior to social exchanges. As algorithms constantly analyze such detailed personal information for a wide range of computations, data privacy emerges as a paramount concern. As a strong, meaningful, and rigorous notion of privacy, differential privacy has provided a powerful framework for designing data analysis algorithms with provable privacy guarantees. Over the past decade, there has been tremendous progress in the theory and algorithms for differential privacy, most of which considers the setting of centralized computation, where a single, static database is subject to many data analyses. However, this standard framework does not capture many complex issues in modern computation. For example, the data might be distributed across self-interested agents, who may have an incentive to misreport their data, and different individuals in the computation may have different expectations of privacy. The goal of this dissertation is to bring the rich theory of differential privacy to several computational problems in practice. We start by studying the problem of private counting query release for high-dimensional data, for which there are well-known computational hardness results. Despite this worst-case intractability barrier, we provide a solution with practical empirical performance by leveraging powerful optimization heuristics. Then we tackle problems in different social and economic settings where the standard notion of differential privacy is not applicable. To that end, we use the perspective of differential privacy to design algorithms with meaningful privacy guarantees. (1) We provide privacy-preserving algorithms for solving a family of economic optimization problems under a strong relaxation of the standard definition of differential privacy---joint differential privacy. (2) We also show that (joint) differential privacy can serve as a novel tool for mechanism design when solving these optimization problems: under our private mechanisms, the agents are incentivized to behave truthfully. (3) Finally, we consider the problem of using social network metadata to guide a search for some class of targeted individuals (for whom we cannot provide any meaningful privacy guarantees). We give a new variant of differential privacy---protected differential privacy---that guarantees differential privacy only for a subgroup of protected individuals. Under this privacy notion, we provide a family of algorithms for searching for targeted individuals in the network while ensuring privacy for the protected (un-targeted) ones.
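
    As a baseline for the counting-query setting the dissertation starts from, the sketch below answers a single counting query with the standard Laplace mechanism: a counting query has sensitivity 1 (one person changes the count by at most 1), so adding Laplace noise of scale 1/epsilon gives epsilon-differential privacy. The helper private_count and the toy records are illustrative; the dissertation's optimization-heuristic approach for releasing many high-dimensional queries at once is far more involved and is not reproduced here.

```python
# Baseline sketch: one counting query under the standard Laplace mechanism.
import numpy as np

def private_count(records, predicate, epsilon, rng=None):
    """Count records satisfying `predicate`, plus Laplace(1/epsilon) noise.

    Sensitivity of a counting query is 1, so the noisy answer is
    epsilon-differentially private.
    """
    rng = rng or np.random.default_rng()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Example: how many records have age >= 40, with epsilon = 0.5?
records = [{"age": 34}, {"age": 51}, {"age": 47}, {"age": 29}]
print(private_count(records, lambda r: r["age"] >= 40, epsilon=0.5))
```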