Communication Steps for Parallel Query Processing
We consider the problem of computing a relational query $q$ on a large input
database of size $n$, using a large number $p$ of servers. The computation is
performed in rounds, and each server can receive only $O(n/p^{1-\varepsilon})$
bits of data, where $\varepsilon \in [0,1]$ is a parameter that controls
replication. We examine how many global communication steps are needed to
compute $q$. We establish both lower and upper bounds, in two settings. For a
single round of communication, we give lower bounds in the strongest possible
model, where arbitrary bits may be exchanged; we show that any algorithm
requires $\varepsilon \geq 1 - 1/\tau^*$, where $\tau^*$ is the fractional vertex
cover of the hypergraph of $q$. We also give an algorithm that matches the
lower bound for a specific class of databases. For multiple rounds of
communication, we present lower bounds in a model where routing decisions for a
tuple are tuple-based. We show that for the class of tree-like queries there
exists a tradeoff between the number of rounds and the space exponent
$\varepsilon$. The lower bounds for multiple rounds are the first of their
kind. Our results also imply that transitive closure cannot be computed in O(1)
rounds of communication.
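Reading the one-round bound as $\varepsilon \geq 1 - 1/\tau^*$, with $\varepsilon$ the space exponent and $\tau^*$ the fractional vertex cover, a small worked instance may help. The triangle query below is an illustrative choice, not taken from the abstract. Assigning weight 1/2 to each of the three variables covers every atom, and no lighter assignment does:

```latex
q(x,y,z) = R(x,y),\, S(y,z),\, T(z,x), \qquad
\tau^*(q) = \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} = \tfrac{3}{2}, \qquad
\varepsilon \;\geq\; 1 - \frac{1}{\tau^*} \;=\; 1 - \frac{2}{3} \;=\; \frac{1}{3}.
```

Under this reading, a one-round algorithm for the triangle query would need each server to receive on the order of $n/p^{2/3}$ bits.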
XYZ Privacy
Future autonomous vehicles will generate, collect, aggregate and consume
significant volumes of data as key gateway devices in emerging Internet of
Things scenarios. While vehicles are widely accepted as one of the most
challenging mobility contexts in which to achieve effective data
communications, less attention has been paid to the privacy of data emerging
from these vehicles. The quality and usability of such privatized data will lie
at the heart of future safe and efficient transportation solutions.
In this paper, we present the XYZ Privacy mechanism. XYZ Privacy is, to our
knowledge, the first such mechanism that enables data creators to submit
multiple contradictory responses to a query, whilst preserving utility measured
as the absolute error from the actual original data. The functionalities are
achieved in both a scalable and secure fashion. For instance, individual
location data can be obfuscated while preserving utility, thereby enabling the
scheme to transparently integrate with existing systems (e.g. Waze). A new
cryptographic primitive, Function Secret Sharing, is used to achieve
non-attributable writes, and we show an order-of-magnitude improvement over the
default implementation.
Comment: arXiv admin note: text overlap with arXiv:1708.0188
What Can We Learn Privately?
Learning problems form an important category of computational tasks that
generalizes many of the computations researchers apply to large real-life data
sets. We ask: what concept classes can be learned privately, namely, by an
algorithm whose output does not depend too heavily on any one input or specific
training example? More precisely, we investigate learning algorithms that
satisfy differential privacy, a notion that provides strong confidentiality
guarantees in contexts where aggregate information is released about a database
containing sensitive information about individuals. We demonstrate that,
ignoring computational constraints, it is possible to privately agnostically
learn any concept class using a sample size approximately logarithmic in the
cardinality of the concept class. Therefore, almost anything learnable is
learnable privately: specifically, if a concept class is learnable by a
(non-private) algorithm with polynomial sample complexity and output size, then
it can be learned privately using a polynomial number of samples. We also
present a computationally efficient private PAC learner for the class of parity
functions. Local (or randomized response) algorithms are a practical class of
private algorithms that have received extensive investigation. We provide a
precise characterization of local private learning algorithms. We show that a
concept class is learnable by a local algorithm if and only if it is learnable
in the statistical query (SQ) model. Finally, we present a separation between
the power of interactive and noninteractive local learning algorithms.
Comment: 35 pages, 2 figures
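The local (randomized response) model mentioned in this abstract can be illustrated with a minimal sketch. It assumes each individual holds a single bit and that the privacy parameter `epsilon` is positive. The function names are hypothetical and the code is not taken from the paper.

```python
import math
import random


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit


def estimate_mean(reports, epsilon: float) -> float:
    """Debias the average of noisy reports into an estimate of the true mean.

    Uses E[report] = (1 - p) + (2p - 1) * true_mean, where p is the
    probability of reporting truthfully. Requires epsilon > 0.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    true_bits = [1] * 7000 + [0] * 3000
    noisy = [randomized_response(b, 1.0, rng) for b in true_bits]
    print(estimate_mean(noisy, 1.0))  # close to 0.7, up to sampling noise
```

Each report on its own reveals little about the underlying bit, yet the aggregate remains estimable. This is the sense in which local algorithms resemble statistical queries.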
Statistical Active Learning Algorithms for Noise Tolerance and Differential Privacy
We describe a framework for designing efficient active learning algorithms
that are tolerant to random classification noise and are
differentially-private. The framework is based on active learning algorithms
that are statistical in the sense that they rely on estimates of expectations
of functions of filtered random examples. It builds on the powerful statistical
query framework of Kearns (1993).
We show that any efficient active statistical learning algorithm can be
automatically converted to an efficient active learning algorithm which is
tolerant to random classification noise as well as other forms of
"uncorrelated" noise. The complexity of the resulting algorithms has
information-theoretically optimal quadratic dependence on $1/(1-2\eta)$, where
$\eta$ is the noise rate.
We show that commonly studied concept classes including thresholds,
rectangles, and linear separators can be efficiently actively learned in our
framework. These results combined with our generic conversion lead to the first
computationally-efficient algorithms for actively learning some of these
concept classes in the presence of random classification noise that provide
exponential improvement in the dependence on the error $\epsilon$ over their
passive counterparts. In addition, we show that our algorithms can be
automatically converted to efficient active differentially-private algorithms.
This leads to the first differentially-private active learning algorithms with
exponential label savings over the passive case.
Comment: Extended abstract appears in NIPS 201
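The exponential label savings of active over passive learning can be seen in the simplest case, thresholds on a line. The sketch below assumes noise-free labels of the form "1 if and only if x >= t". It is plain binary search and does not implement the statistical, noise-tolerant framework of the paper. The names `learn_threshold` and `label_oracle` are hypothetical.

```python
def learn_threshold(points, label_oracle):
    """Find the smallest positively labeled point using few label queries.

    Assumes noise-free threshold labels: label(x) == 1 iff x >= t.
    Returns (smallest positive point or None, number of labels requested).
    """
    xs = sorted(points)
    lo, hi = 0, len(xs)  # index of the first positive point lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if label_oracle(xs[mid]) == 1:
            hi = mid          # first positive is at mid or to its left
        else:
            lo = mid + 1      # first positive is strictly to the right
    boundary = xs[lo] if lo < len(xs) else None
    return boundary, queries


if __name__ == "__main__":
    boundary, used = learn_threshold(range(1000), lambda x: int(x >= 377))
    print(boundary, used)
```

With 1000 unlabeled points the learner requests at most about log2(1000) labels, whereas a passive learner would need labels on a number of points that grows linearly in the inverse of the target error.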
Improved Lower Bounds for Locally Decodable Codes and Private Information Retrieval
We prove new lower bounds for locally decodable codes and private information
retrieval. We show that a 2-query LDC encoding n-bit strings over an l-bit
alphabet, where the decoder only uses b bits of each queried position of the
codeword, needs code length m = exp(Omega(n/(2^b Sum_{i=0}^b {l choose i}))).
Similarly, a 2-server PIR scheme with an n-bit database and t-bit queries,
where the user only needs b bits from each of the two l-bit answers, unknown to
the servers, satisfies t = Omega(n/(2^b Sum_{i=0}^b {l choose i})). This
implies that several known PIR schemes are close to optimal. Our results
generalize those of Goldreich et al. who proved roughly the same bounds for
linear LDCs and PIRs. Like earlier work by Kerenidis and de Wolf, our classical
lower bounds are proved using quantum computational techniques. In particular,
we give a tight analysis of how well a 2-input function can be computed from a
quantum superposition of both inputs.
Comment: 12 pages LaTeX, To appear in ICALP '0
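For readers unfamiliar with 2-query locally decodable codes, the Hadamard code is the textbook example. It is offered here as an illustration and is not a construction from this paper. Its codeword length is 2^n, which is the kind of exponential length that lower bounds of this type address. The sketch assumes an uncorrupted codeword. With a bounded fraction of corrupted positions, the same two random queries return the right bit with probability bounded away from 1/2.

```python
import random


def hadamard_encode(x_bits):
    """Codeword position a holds the inner product <x, a> mod 2, for all a."""
    n = len(x_bits)
    x = sum(b << i for i, b in enumerate(x_bits))
    return [bin(x & a).count("1") % 2 for a in range(1 << n)]


def decode_bit(codeword, n, i, rng):
    """Recover x_i from two queries: positions a and a XOR e_i.

    <x, a> + <x, a XOR e_i> = <x, e_i> = x_i (mod 2) for every a,
    and each queried position is uniformly distributed on its own.
    """
    a = rng.randrange(1 << n)
    return codeword[a] ^ codeword[a ^ (1 << i)]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0]
    code = hadamard_encode(message)
    rng = random.Random(1)
    print([decode_bit(code, len(message), i, rng) for i in range(len(message))])
```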
Shortest Path Computation with No Information Leakage
Shortest path computation is one of the most common queries in location-based
services (LBSs). Although particularly useful, such queries raise serious
privacy concerns. Exposing to a (potentially untrusted) LBS the client's
position and her destination may reveal personal information, such as social
habits, health condition, shopping preferences, lifestyle choices, etc. The
only existing method for privacy-preserving shortest path computation follows
the obfuscation paradigm; it prevents the LBS from inferring the source and
destination of the query with a probability higher than a threshold. This
implies, however, that the LBS still deduces some information (albeit not
exact) about the client's location and her destination. In this paper we aim at
strong privacy, where the adversary learns nothing about the shortest path
query. We achieve this via established private information retrieval
techniques, which we treat as black-box building blocks. Experiments on real,
large-scale road networks assess the practicality of our schemes.
Comment: VLDB201
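The abstract treats private information retrieval as a black box. To make the idea concrete, here is a toy version of the classic two-server XOR scheme. It assumes two non-colluding servers that each hold a full copy of a bit database. It is not necessarily the PIR protocol used in the paper, and all function names are hypothetical.

```python
import secrets


def make_queries(n, i):
    """Build two selection vectors that differ only at index i.

    Each vector on its own is uniformly random, so a single server
    learns nothing about i.
    """
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = list(q1)
    q2[i] ^= 1
    return q1, q2


def server_answer(db_bits, query):
    """Return the XOR of the database bits selected by the query vector."""
    answer = 0
    for bit, selected in zip(db_bits, query):
        if selected:
            answer ^= bit
    return answer


def retrieve(db_bits, i):
    """Client side: combine both answers; all bits except db[i] cancel."""
    q1, q2 = make_queries(len(db_bits), i)
    return server_answer(db_bits, q1) ^ server_answer(db_bits, q2)


if __name__ == "__main__":
    database = [1, 0, 0, 1, 1, 0, 1]
    print([retrieve(database, i) for i in range(len(database))])
```

In a shortest-path setting, the database entries would be precomputed pieces of routing information rather than single bits, but the retrieval pattern is the same.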