Privacy-Friendly Mobility Analytics using Aggregate Location Data
Location data can be extremely useful to study commuting patterns and
disruptions, as well as to predict real-time traffic volumes. At the same time,
however, the fine-grained collection of user locations raises serious privacy
concerns, as it can reveal sensitive information about users, such as
lifestyle, political and religious inclinations, or even identities. In this
paper, we study the feasibility of crowd-sourced mobility analytics over
aggregate location information: users periodically report their location, using
a privacy-preserving aggregation protocol, so that the server can only recover
aggregates -- i.e., how many, but not which, users are in a region at a given
time. We experiment with real-world mobility datasets obtained from the
Transport for London authority and the San Francisco Cabs network, and present
a novel methodology based on time series modeling that is geared to forecast
traffic volumes in regions of interest and to detect mobility anomalies in
them. In the presence of anomalies, we also make enhanced traffic volume
predictions by feeding our model with additional information from correlated
regions. Finally, we present and evaluate a mobile app prototype, called
Mobility Data Donors (MDD), in terms of computation, communication, and energy
overhead, demonstrating the real-world deployability of our techniques.
Comment: Published at ACM SIGSPATIAL 201
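The privacy-preserving aggregation step described in this abstract can be illustrated with pairwise additive masking, a common building block for such protocols. This is a minimal sketch, not the paper's actual protocol: the function names and the shared-seed setup are illustrative assumptions. Each pair of users shares a random mask that one adds and the other subtracts, so the server's sum reveals only the per-region count.

```python
import random

def make_pairwise_masks(user_ids, modulus, seed=0):
    """Hypothetical setup: each unordered user pair (i, j) shares a random
    mask; user i adds it and user j subtracts it, so all masks cancel in
    the sum. (A real protocol would derive these from pairwise key exchange.)"""
    rng = random.Random(seed)
    masks = {u: 0 for u in user_ids}
    for a in range(len(user_ids)):
        for b in range(a + 1, len(user_ids)):
            m = rng.randrange(modulus)
            masks[user_ids[a]] = (masks[user_ids[a]] + m) % modulus
            masks[user_ids[b]] = (masks[user_ids[b]] - m) % modulus
    return masks

def blind_report(in_region, mask, modulus):
    """A user reports 1 if present in the region, blinded by their mask."""
    return (int(in_region) + mask) % modulus

def aggregate(reports, modulus):
    """Server sums blinded reports; the masks cancel, leaving only the
    count of users in the region -- how many, but not which."""
    return sum(reports) % modulus
```

An individual blinded report looks uniformly random to the server; only the modular sum over all participants is meaningful.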
Trade Privacy for Utility: A Learning-Based Privacy Pricing Game in Federated Learning
To prevent implicit privacy disclosure in sharing gradients among data owners
(DOs) under federated learning (FL), differential privacy (DP) and its variants
have become a common practice to offer formal privacy guarantees with low
overheads. However, individual DOs generally tend to inject larger DP noises
for stronger privacy provisions (which entails severe degradation of model
utility), while the curator (i.e., aggregation server) aims to minimize the
overall effect of the added random noise for satisfactory model performance. To
resolve these conflicting goals, we propose a novel dynamic privacy pricing
(DyPP) game which allows DOs to sell individual privacy (by lowering the scale
of locally added DP noise) for differentiated economic compensations (offered
by the curator), thereby enhancing FL model utility. Considering
multi-dimensional information asymmetry among players (e.g., DO's data
distribution and privacy preference, and curator's maximum affordable payment)
as well as their varying private information in distinct FL tasks, it is hard
to directly attain the Nash equilibrium of the mixed-strategy DyPP game.
Alternatively, we devise a fast reinforcement learning algorithm with two
layers to quickly learn the optimal mixed noise-saving strategy of DOs and the
optimal mixed pricing strategy of the curator without prior knowledge of
players' private information. Experiments on real datasets validate the
feasibility and effectiveness of the proposed scheme in terms of faster
convergence and enhanced FL model utility at lower payment cost.
Comment: Accepted by IEEE ICC202
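The privacy-utility tension this abstract describes can be sketched with the Laplace mechanism, a standard DP primitive (the paper may use a different DP variant; the function names here are illustrative). The noise scale is sensitivity/epsilon, so a DO demanding stronger privacy (smaller epsilon) injects larger noise into each shared gradient, degrading the curator's aggregate.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def dp_gradient(grad, epsilon, sensitivity=1.0, rng=None):
    """Laplace mechanism: noise scale b = sensitivity / epsilon, so a
    smaller epsilon (stronger privacy) means more noise per coordinate."""
    rng = rng or random.Random()
    b = sensitivity / epsilon
    return [g + laplace_noise(b, rng) for g in grad]

def average_gradients(grads):
    """Curator-side aggregation: coordinate-wise mean of DO updates."""
    n = len(grads)
    return [sum(col) / n for col in zip(*grads)]
```

Averaging many updates produced with a large epsilon tracks the true gradient far more closely than the same number produced with a small epsilon, which is exactly the utility the curator is willing to pay DOs to recover.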
Federated Learning Framework Coping with Hierarchical Heterogeneity in Cooperative ITS
In this paper, we introduce a federated learning framework coping with
Hierarchical Heterogeneity (H2-Fed), which can notably enhance the conventional
pre-trained deep learning model. The framework exploits data from connected
public traffic agents in vehicular networks without affecting user data
privacy. By coordinating existing traffic infrastructure, including roadside
units and road traffic clouds, the model parameters are efficiently
disseminated by vehicular communications and hierarchically aggregated.
Considering the heterogeneity of data distributions, computational resources,
and communication capabilities across traffic agents and roadside units, we
employ a novel method that addresses heterogeneity at the different aggregation
layers of the framework architecture, i.e., aggregation at the roadside units
and in the cloud. The experimental results indicate that our method balances
learning accuracy and stability well, given knowledge of the heterogeneity in
current communication networks. Compared to baseline approaches, evaluation on
a non-IID MNIST dataset shows that our framework is more general and more
capable, especially in application scenarios with low communication quality.
Even when 90% of the agents are temporarily disconnected, the pre-trained deep
learning model still converges stably, and its accuracy improves from 68% to
over 90% after convergence.
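The two-layer aggregation this abstract describes can be sketched as hierarchical FedAvg: each roadside unit averages the updates of its connected vehicles, then the cloud averages the roadside-unit models. This is a minimal sketch of the general pattern, not H2-Fed's specific heterogeneity-aware weighting; the function names are illustrative assumptions.

```python
def fed_avg(models):
    """Sample-count-weighted average of parameter vectors.
    `models` is a list of (weights, num_local_samples) pairs."""
    total = sum(n for _, n in models)
    dim = len(models[0][0])
    return [sum(w[i] * n for w, n in models) / total for i in range(dim)]

def hierarchical_aggregate(rsu_groups):
    """Two-layer aggregation: each roadside unit (RSU) averages its
    vehicles' updates, then the cloud averages the RSU models, each
    weighted by the total samples the RSU covers."""
    rsu_models = [(fed_avg(group), sum(n for _, n in group))
                  for group in rsu_groups]
    return fed_avg(rsu_models)
```

With plain sample-count weighting, the hierarchical result coincides with flat FedAvg over all agents; the value of the hierarchy lies in cutting vehicle-to-cloud communication and in applying per-layer corrections when agents drop out.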
User's Privacy in Recommendation Systems Applying Online Social Network Data, A Survey and Taxonomy
Recommender systems have become an integral part of many social networks and
extract knowledge from a user's personal and sensitive data, both explicitly,
with the user's knowledge, and implicitly. This trend has created major privacy
concerns, as users are mostly unaware of what data is collected, how much of
it is used, and how securely it is handled. In this context, several works
have addressed privacy concerns in the use of online social network data and
in recommender systems. This paper surveys the main privacy concerns, measurements,
and privacy-preserving techniques used in large-scale online social networks
and recommender systems. It draws on prior work on security, privacy
preservation, statistical modeling, and datasets to provide an overview of the
technical difficulties and problems associated with privacy preservation in
online social networks.
Comment: 26 pages, IET book chapter on big data recommender system