2,388 research outputs found
Addressing practical challenges of Bayesian optimisation
This thesis addresses several challenges in applying Bayesian optimisation to real-world problems. Its contributions are new Bayesian optimisation algorithms for three practical problems: finding stable solutions, optimising cascaded processes, and privacy-aware optimisation.
Privacy-Friendly Mobility Analytics using Aggregate Location Data
Location data can be extremely useful to study commuting patterns and
disruptions, as well as to predict real-time traffic volumes. At the same time,
however, the fine-grained collection of user locations raises serious privacy
concerns, as this can reveal sensitive information about the users, such as
lifestyle, political and religious inclinations, or even identities. In this
paper, we study the feasibility of crowd-sourced mobility analytics over
aggregate location information: users periodically report their location, using
a privacy-preserving aggregation protocol, so that the server can only recover
aggregates -- i.e., how many, but not which, users are in a region at a given
time. We experiment with real-world mobility datasets obtained from the
Transport For London authority and the San Francisco Cabs network, and present
a novel methodology based on time series modeling that is geared to forecast
traffic volumes in regions of interest and to detect mobility anomalies in
them. In the presence of anomalies, we also make enhanced traffic volume
predictions by feeding our model with additional information from correlated
regions. Finally, we present and evaluate a mobile app prototype, called
Mobility Data Donors (MDD), in terms of computation, communication, and energy
overhead, demonstrating the real-world deployability of our techniques.
Comment: Published at ACM SIGSPATIAL 201
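To illustrate what "how many, but not which" means for the aggregates described in this abstract, here is a minimal Python sketch. It is not the paper's method: the cryptographic aggregation protocol is replaced by a plain in-memory count, and the time series model is replaced by a simple z-score test against past counts. The names `aggregate_counts` and `flag_anomaly` and the threshold of 3 standard deviations are assumptions made for this example only.

```python
from collections import Counter
from statistics import mean, pstdev


def aggregate_counts(reports):
    """Turn (user_id, region, time_slot) reports into per-cell counts.

    Only the counts are returned, so the result says how many users were
    in a region at a time, not which ones. (Assumption: a real deployment
    would compute this under a privacy-preserving protocol so the server
    never sees the individual reports.)
    """
    # Count each user at most once per region and time slot.
    unique = {(user, region, slot) for user, region, slot in reports}
    return Counter((region, slot) for _, region, slot in unique)


def flag_anomaly(history, observed, threshold=3.0):
    """Return True if `observed` is far from the past counts in `history`.

    `history` holds earlier counts for the same region and time slot.
    The z-score rule is an illustrative stand-in, not the paper's model.
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold


reports = [("a", "r1", 0), ("b", "r1", 0), ("a", "r1", 0), ("c", "r2", 0)]
counts = aggregate_counts(reports)
```

In this sketch a count of 160 against a history hovering around 100 would be flagged, while a count of 100 would not.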
Modeling, Predicting and Capturing Human Mobility
Realistic models of human mobility are critical for modern-day applications, particularly in the recommendation systems, resource planning, and process optimization domains. Given the rapid proliferation of mobile devices equipped with Internet connectivity and GPS functionality, aggregating large volumes of individual geolocation data is now feasible. The thesis focuses on methodologies to facilitate data-driven mobility modeling by drawing parallels between the inherent nature of mobility trajectories, statistical physics, and information theory. On the applied side, the thesis contributions lie in leveraging the formulated mobility models to construct prediction workflows from a privacy-by-design perspective. This enables end users to derive utility from location-based services while preserving their location privacy. Finally, the thesis presents several approaches that apply machine learning to generate large-scale synthetic mobility datasets and so facilitate experimental reproducibility.
A Neighbourhood-Aware Differential Privacy Mechanism for Static Word Embeddings
We propose a Neighbourhood-Aware Differential Privacy (NADP) mechanism
considering the neighbourhood of a word in a pretrained static word embedding
space to determine the minimal amount of noise required to guarantee a
specified privacy level. We first construct a nearest neighbour graph over the
words using their embeddings, and factorise it into a set of connected
components (i.e. neighbourhoods). We then separately apply different levels of
Gaussian noise to the words in each neighbourhood, determined by the set of
words in that neighbourhood. Experiments show that our proposed NADP mechanism
consistently outperforms multiple previously proposed DP mechanisms such as
Laplacian, Gaussian, and Mahalanobis in multiple downstream tasks, while
guaranteeing higher levels of privacy.
Comment: Accepted to IJCNLP-AACL 202
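The pipeline outlined in this abstract (nearest neighbour graph, connected components, per-neighbourhood Gaussian noise) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' NADP mechanism: the graph links each word only to its single nearest neighbour, and the noise scale is set to a hypothetical `base_sigma` times the neighbourhood diameter. That scaling rule is invented here and carries no formal differential privacy guarantee.

```python
import math
import random


def nearest_neighbour_components(vectors):
    """Group vector indices into connected components of a 1-NN graph.

    Requires at least two vectors. Each vector is linked to its single
    nearest neighbour (Euclidean distance) and the links are merged
    with a small union-find structure.
    """
    n = len(vectors)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        others = (k for k in range(n) if k != i)
        j = min(others, key=lambda k: math.dist(vectors[i], vectors[k]))
        parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def add_neighbourhood_noise(vectors, base_sigma=0.1, rng=None):
    """Add Gaussian noise with one noise level per neighbourhood.

    Assumption for illustration only: sigma = base_sigma * diameter,
    where diameter is the largest pairwise distance in the component.
    """
    rng = rng or random.Random(0)
    noisy = [list(v) for v in vectors]
    for comp in nearest_neighbour_components(vectors):
        diameter = max(math.dist(vectors[a], vectors[b])
                       for a in comp for b in comp)
        sigma = base_sigma * diameter
        for i in comp:
            noisy[i] = [x + rng.gauss(0.0, sigma) for x in vectors[i]]
    return noisy


toy_embeddings = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
components = nearest_neighbour_components(toy_embeddings)
noisy_embeddings = add_neighbourhood_noise(toy_embeddings)
```

On the toy input, the first two and the last two vectors form separate neighbourhoods, so each pair receives its own noise level.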
What Does The Crowd Say About You? Evaluating Aggregation-based Location Privacy
Information about people’s movements and the
locations they visit enables an increasing number of mobility
analytics applications, e.g., in the context of urban and transportation
planning. In this setting, rather than collecting or
sharing raw data, entities often use aggregation as a privacy
protection mechanism, aiming to hide individual users’ location
traces. Furthermore, to bound information leakage from
the aggregates, they can perturb the input of the aggregation
or its output to ensure that these are differentially private.
In this paper, we set out to evaluate the impact of releasing aggregate
location time-series on the privacy of individuals contributing
to the aggregation. We introduce a framework allowing
us to reason about privacy against an adversary attempting
to predict users’ locations or recover their mobility patterns.
We formalize these attacks as inference problems, and
discuss a few strategies to model the adversary’s prior knowledge
based on the information she may have access to. We
then use the framework to quantify the privacy loss stemming
from aggregate location data, with and without the protection
of differential privacy, using two real-world mobility datasets.
We find that aggregates do leak information about individuals’
punctual locations and mobility profiles. The density of
the observations, as well as timing, play important roles, e.g.,
regular patterns during peak hours are better protected than
sporadic movements. Finally, our evaluation shows that both
output and input perturbation offer little additional protection,
unless they introduce large amounts of noise, ultimately destroying
the utility of the data.
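Output perturbation, as mentioned in this abstract, is commonly done with the Laplace mechanism: noise with scale sensitivity/epsilon is added to each released count. The sketch below shows that standard construction and is not the paper's evaluation code. The function names and the default `sensitivity=1.0` (each user contributes to at most one count per release) are assumptions, and releasing a time series would spend privacy budget across time slots, which this example does not track.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Return a copy of `counts` with Laplace noise added to every value.

    A smaller epsilon gives stronger protection and larger noise:
    the noise scale is sensitivity / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return {key: value + laplace_noise(scale, rng)
            for key, value in counts.items()}


true_counts = {("r1", 0): 120, ("r2", 0): 3}
released = perturb_counts(true_counts, epsilon=0.5, rng=random.Random(7))
```

With epsilon = 0.5 the typical noise magnitude is about 2, which barely changes the count of 120 but is comparable to the count of 3. This is the kind of privacy versus utility tension the abstract refers to.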