518 research outputs found
Understanding Compressive Adversarial Privacy
Designing a data sharing mechanism without sacrificing too much privacy can
be considered as a game between data holders and malicious attackers. This
paper describes a compressive adversarial privacy framework that captures the
trade-off between the data privacy and utility. We characterize the optimal
data releasing mechanism through convex optimization when assuming that both
the data holder and attacker can only modify the data using linear
transformations. We then build a more realistic data releasing mechanism that
can rely on a nonlinear compression model while the attacker uses a neural
network. We demonstrate in a series of empirical applications that this
framework, consisting of compressive adversarial privacy, can preserve
sensitive information
Towards Efficient Data Valuation Based on the Shapley Value
"How much is my data worth?" is an increasingly common question posed by
organizations and individuals alike. An answer to this question could allow,
for instance, fairly distributing profits among multiple data contributors and
determining prospective compensation when data breaches happen. In this paper,
we study the problem of data valuation by utilizing the Shapley value, a
popular notion of value which originated in coopoerative game theory. The
Shapley value defines a unique payoff scheme that satisfies many desiderata for
the notion of data value. However, the Shapley value often requires exponential
time to compute. To meet this challenge, we propose a repertoire of efficient
algorithms for approximating the Shapley value. We also demonstrate the value
of each training instance for various benchmark datasets
On Lightweight Privacy-Preserving Collaborative Learning for IoT Objects
The Internet of Things (IoT) will be a main data generation infrastructure
for achieving better system intelligence. This paper considers the design and
implementation of a practical privacy-preserving collaborative learning scheme,
in which a curious learning coordinator trains a better machine learning model
based on the data samples contributed by a number of IoT objects, while the
confidentiality of the raw forms of the training data is protected against the
coordinator. Existing distributed machine learning and data encryption
approaches incur significant computation and communication overhead, rendering
them ill-suited for resource-constrained IoT objects. We study an approach that
applies independent Gaussian random projection at each IoT object to obfuscate
data and trains a deep neural network at the coordinator based on the projected
data from the IoT objects. This approach introduces light computation overhead
to the IoT objects and moves most workload to the coordinator that can have
sufficient computing resources. Although the independent projections performed
by the IoT objects address the potential collusion between the curious
coordinator and some compromised IoT objects, they significantly increase the
complexity of the projected data. In this paper, we leverage the superior
learning capability of deep learning in capturing sophisticated patterns to
maintain good learning performance. Extensive comparative evaluation shows that
this approach outperforms other lightweight approaches that apply additive
noisification for differential privacy and/or support vector machines for
learning in the applications with light data pattern complexities.Comment: 12 pages,IOTDI 201
Recommended from our members
TOWARDS RELIABLE CIRCUMVENTION OF INTERNET CENSORSHIP
The Internet plays a crucial role in today\u27s social and political movements by facilitating the free circulation of speech, information, and ideas; democracy and human rights throughout the world critically depend on preserving and bolstering the Internet\u27s openness. Consequently, repressive regimes, totalitarian governments, and corrupt corporations regulate, monitor, and restrict the access to the Internet, which is broadly known as Internet \emph{censorship}. Most countries are improving the internet infrastructures, as a result they can implement more advanced censoring techniques. Also with the advancements in the application of machine learning techniques for network traffic analysis have enabled the more sophisticated Internet censorship. In this thesis, We take a close look at the main pillars of internet censorship, we will introduce new defense and attacks in the internet censorship literature.
Internet censorship techniques investigate users’ communications and they can decide to interrupt a connection to prevent a user from communicating with a specific entity. Traffic analysis is one of the main techniques used to infer information from internet communications. One of the major challenges to traffic analysis mechanisms is scaling the techniques to today\u27s exploding volumes of network traffic, i.e., they impose high storage, communications, and computation overheads. We aim at addressing this scalability issue by introducing a new direction for traffic analysis, which we call \emph{compressive traffic analysis}. Moreover, we show that, unfortunately, traffic analysis attacks can be conducted on Anonymity systems with drastically higher accuracies than before by leveraging emerging learning mechanisms. We particularly design a system, called \deepcorr, that outperforms the state-of-the-art by significant margins in correlating network connections. \deepcorr leverages an advanced deep learning architecture to \emph{learn} a flow correlation function tailored to complex networks. Also to be able to analyze the weakness of such approaches we show that an adversary can defeat deep neural network based traffic analysis techniques by applying statistically undetectable \emph{adversarial perturbations} on the patterns of live network traffic.
We also design techniques to circumvent internet censorship. Decoy routing is an emerging approach for censorship circumvention in which circumvention is implemented with help from a number of volunteer Internet autonomous systems, called decoy ASes. We propose a new architecture for decoy routing that, by design, is significantly stronger to rerouting attacks compared to \emph{all} previous designs. Unlike previous designs, our new architecture operates decoy routers only on the downstream traffic of the censored users; therefore we call it \emph{downstream-only} decoy routing. As we demonstrate through Internet-scale BGP simulations, downstream-only decoy routing offers significantly stronger resistance to rerouting attacks, which is intuitively because a (censoring) ISP has much less control on the downstream BGP routes of its traffic. Then, we propose to use game theoretic approaches to model the arms races between the censors and the censorship circumvention tools. This will allow us to analyze the effect of different parameters or censoring behaviors on the performance of censorship circumvention tools. We apply our methods on two fundamental problems in internet censorship.
Finally, to bring our ideas to practice, we designed a new censorship circumvention tool called \name. \name aims at increasing the collateral damage of censorship by employing a ``mass\u27\u27 of normal Internet users, from both censored and uncensored areas, to serve as circumvention proxies
Detecting the Unexpected via Image Resynthesis
Classical semantic segmentation methods, including the recent deep learning
ones, assume that all classes observed at test time have been seen during
training. In this paper, we tackle the more realistic scenario where unexpected
objects of unknown classes can appear at test time. The main trends in this
area either leverage the notion of prediction uncertainty to flag the regions
with low confidence as unknown, or rely on autoencoders and highlight
poorly-decoded regions. Having observed that, in both cases, the detected
regions typically do not correspond to unexpected objects, in this paper, we
introduce a drastically different strategy: It relies on the intuition that the
network will produce spurious labels in regions depicting unexpected objects.
Therefore, resynthesizing the image from the resulting semantic map will yield
significant appearance differences with respect to the input image. In other
words, we translate the problem of detecting unknown classes to one of
identifying poorly-resynthesized image regions. We show that this outperforms
both uncertainty- and autoencoder-based methods
- …