Of Wines and Reviews: Measuring and Modeling the Vivino Wine Social Network
This paper presents an analysis of social experiences around wine consumption
through the lens of Vivino, a social network for wine enthusiasts with over 26
million users worldwide. We compare users' perceptions of various wine types
and regional styles across both New and Old World wines, examining them across
price ranges, vintages, regions, varietals, and blends. Among other things, we
find that ratings provided by Vivino users are not biased by cost. We then
study how wine characteristics, language in wine reviews, and the distribution
of wine ratings can be combined to develop prediction models. More
specifically, we model user behavior to develop a regression model for
predicting wine ratings, and a classifier for determining user review
preferences.
Comment: A preliminary version of this paper appears in the Proceedings of the
IEEE/ACM International Conference on Advances in Social Networks Analysis and
Mining (ASONAM 2018). This is the full version.
Danger is My Middle Name: Experimenting with SSL Vulnerabilities in Android Apps
This paper presents a measurement study of information leakage and SSL
vulnerabilities in popular Android apps. We perform static and dynamic analysis
on 100 apps, downloaded at least 10M times, that request full network access.
Our experiments show that, although prior work has drawn a lot of attention to
SSL implementations on mobile platforms, several popular apps (32/100) accept
all certificates and all hostnames, and four actually transmit sensitive data
unencrypted. We set up an experimental testbed simulating man-in-the-middle
attacks and find that many apps (up to 91% when the adversary has a certificate
installed on the victim's device) are vulnerable, allowing the attacker to
access sensitive information, including credentials, files, personal details,
and credit card numbers. Finally, we provide a few recommendations to app
developers and highlight several open research problems.
Comment: A preliminary version of this paper appears in the Proceedings of ACM
WiSec 2015. This is the full version.
Privacy-Preserving Genetic Relatedness Test
An increasing number of individuals are turning to Direct-To-Consumer (DTC)
genetic testing to learn about their predisposition to diseases, traits, and/or
ancestry. DTC companies like 23andMe and Ancestry.com have started to offer
popular and affordable ancestry and genealogy tests, with services allowing
users to find unknown relatives and distant cousins. Naturally, access to and
possible dissemination of genetic data prompt serious privacy concerns, thus
motivating the need to design efficient primitives supporting private genetic
tests. In this paper, we present an effective protocol for privacy-preserving
genetic relatedness test (PPGRT), enabling a cloud server to run relatedness
tests on input an encrypted genetic database and a test facility's encrypted
genetic sample. We reduce the test to a data matching problem and perform it,
privately, using searchable encryption. Finally, a performance evaluation of
Hamming-distance-based PPGRT attests to the practicality of our proposals.
Comment: A preliminary version of this paper appears in the Proceedings of the
3rd International Workshop on Genome Privacy and Security (GenoPri'16).
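As a rough illustration of the Hamming-distance comparison mentioned in the abstract above, the following sketch counts mismatching positions between two equal-length marker sequences. It works on plaintext only and is not the privacy-preserving protocol itself; the function name and the sample sequences are hypothetical.

```python
def hamming_distance(sample_a, sample_b):
    """Count the positions at which two equal-length sequences differ."""
    if len(sample_a) != len(sample_b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(sample_a, sample_b))


# Hypothetical marker sequences, used only for illustration.
reference = "ACGTACGT"
candidate = "ACGAACGG"
distance = hamming_distance(reference, candidate)
```

A smaller distance suggests more shared markers; how a distance maps to a degree of relatedness is left open here.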
Measuring Membership Privacy on Aggregate Location Time-Series
While location data is extremely valuable for various applications,
disclosing it prompts serious threats to individuals' privacy. To limit such
concerns, organizations often provide analysts with aggregate time-series that
indicate, e.g., how many people are in a location at a time interval, rather
than raw individual traces. In this paper, we perform a measurement study to
understand Membership Inference Attacks (MIAs) on aggregate location
time-series, where an adversary tries to infer whether a specific user
contributed to the aggregates.
We find that the volume of contributed data, as well as the regularity and
particularity of users' mobility patterns, play a crucial role in the attack's
success. We experiment with a wide range of defenses based on generalization,
hiding, and perturbation, and evaluate their ability to thwart the attack
vis-a-vis the utility loss they introduce for various mobility analytics tasks.
Our results show that some defenses fail across the board, while others work
for specific tasks on aggregate location time-series. For instance, suppressing
small counts can be used for ranking hotspots, data generalization for
forecasting traffic, hotspot discovery, and map inference, while sampling is
effective for location labeling and anomaly detection when the dataset is
sparse. Differentially private techniques provide reasonable accuracy only in
very specific settings, e.g., discovering hotspots and forecasting their
traffic, and more so when using weaker privacy notions like crowd-blending
privacy. Overall, our measurements show that there does not exist a unique
generic defense that can preserve the utility of the analytics for arbitrary
applications, and provide useful insights regarding the disclosure of sanitized
aggregate location time-series.
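One of the defenses named in the abstract above, suppressing small counts before ranking hotspots, can be sketched as follows. This is a minimal illustration under assumptions: the threshold, the location names, and the counts are made up, and the paper's actual parameters and evaluation are not reproduced here.

```python
def suppress_small_counts(series, threshold):
    """Replace every count below `threshold` with 0 (hypothetical defense parameter)."""
    return {
        location: [count if count >= threshold else 0 for count in counts]
        for location, counts in series.items()
    }


def rank_hotspots(series):
    """Order locations by total count, busiest first (ties broken by name)."""
    totals = {location: sum(counts) for location, counts in series.items()}
    return sorted(totals, key=lambda location: (-totals[location], location))


# Made-up aggregates: number of users per location in three time slots.
aggregates = {
    "station_a": [120, 95, 3],
    "station_b": [2, 4, 1],
    "station_c": [40, 2, 55],
}
sanitized = suppress_small_counts(aggregates, threshold=5)
ranking_before = rank_hotspots(aggregates)
ranking_after = rank_hotspots(sanitized)
```

In this toy example the hotspot ranking is the same before and after suppression, which matches the intuition that small counts matter little for ranking the busiest locations.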
Systematizing Genome Privacy Research: A Privacy-Enhancing Technologies Perspective
Rapid advances in human genomics are enabling researchers to gain a better
understanding of the role of the genome in our health and well-being,
stimulating hope for more effective and cost-efficient healthcare. However,
this also prompts a number of security and privacy concerns stemming from the
distinctive characteristics of genomic data. To address them, a new research
community has emerged and produced a large number of publications and
initiatives.
In this paper, we rely on a structured methodology to contextualize and
provide a critical analysis of the current knowledge on privacy-enhancing
technologies used for testing, storing, and sharing genomic data, using a
representative sample of the work published in the past decade. We identify and
discuss limitations, technical challenges, and issues faced by the community,
focusing in particular on those that are inherently tied to the nature of the
problem and are harder for the community alone to address. Finally, we report
on the importance and difficulty of the identified challenges based on an
online survey of genome data privacy experts.
Comment: To appear in the Proceedings on Privacy Enhancing Technologies
(PoPETs), Vol. 2019, Issue
Privacy-Friendly Mobility Analytics using Aggregate Location Data
Location data can be extremely useful to study commuting patterns and
disruptions, as well as to predict real-time traffic volumes. At the same time,
however, the fine-grained collection of user locations raises serious privacy
concerns, as this can reveal sensitive information about the users, such as
lifestyle, political and religious inclinations, or even identities. In this
paper, we study the feasibility of crowd-sourced mobility analytics over
aggregate location information: users periodically report their location, using
a privacy-preserving aggregation protocol, so that the server can only recover
aggregates -- i.e., how many, but not which, users are in a region at a given
time. We experiment with real-world mobility datasets obtained from the
Transport for London authority and the San Francisco Cabs network, and present
a novel methodology based on time series modeling that is geared to forecast
traffic volumes in regions of interest and to detect mobility anomalies in
them. In the presence of anomalies, we also make enhanced traffic volume
predictions by feeding our model with additional information from correlated
regions. Finally, we present and evaluate a mobile app prototype, called
Mobility Data Donors (MDD), in terms of computation, communication, and energy
overhead, demonstrating the real-world deployability of our techniques.
Comment: Published at ACM SIGSPATIAL 201
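The idea that a server recovers "how many, but not which" users are in a region can be illustrated with a generic additive-blinding sketch. It is not necessarily the aggregation protocol used in the paper: the zero-sum masks handed out by a single dealer, the modulus, and all function names are assumptions made for illustration.

```python
import secrets

MODULUS = 2**32  # assumed modulus, comfortably larger than any real count


def make_zero_sum_masks(num_users, num_regions):
    """Draw one random mask vector per user so that all masks sum to 0 mod MODULUS."""
    masks = [
        [secrets.randbelow(MODULUS) for _ in range(num_regions)]
        for _ in range(num_users - 1)
    ]
    # The last mask cancels the sum of all the others in every region.
    last = [(-sum(mask[r] for mask in masks)) % MODULUS for r in range(num_regions)]
    masks.append(last)
    return masks


def blind_report(region_index, mask):
    """Add the user's mask to a one-hot vector marking the user's region."""
    return [
        ((1 if r == region_index else 0) + mask[r]) % MODULUS
        for r in range(len(mask))
    ]


def aggregate(blinded_reports):
    """Sum blinded reports per region; the masks cancel and only counts remain."""
    return [sum(column) % MODULUS for column in zip(*blinded_reports)]


# Five hypothetical users located in regions 0, 2, 2, 1, and 2.
user_regions = [0, 2, 2, 1, 2]
masks = make_zero_sum_masks(len(user_regions), num_regions=3)
reports = [blind_report(region, mask) for region, mask in zip(user_regions, masks)]
counts = aggregate(reports)
```

Each individual report looks random to the server, while the per-region sums equal the true counts. A deployed protocol would also need to distribute the masks without a trusted dealer and cope with users who drop out, which this sketch ignores.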
On Collaborative Predictive Blacklisting
Collaborative predictive blacklisting (CPB) makes it possible to forecast future attack
sources based on logs and alerts contributed by multiple organizations.
Unfortunately, however, research on CPB has only focused on increasing the
number of predicted attacks but has not considered the impact on false
positives and false negatives. Moreover, sharing alerts is often hindered by
confidentiality, trust, and liability issues, which motivates the need for
privacy-preserving approaches to the problem. In this paper, we present a
measurement study of state-of-the-art CPB techniques, aiming to shed light on
the actual impact of collaboration. To this end, we reproduce and measure two
systems: a non-privacy-friendly one that uses a trusted coordinating party with
access to all alerts (Soldo et al., 2010) and a peer-to-peer one using
privacy-preserving data sharing (Freudiger et al., 2015). We show that, while
collaboration boosts the number of predicted attacks, it also yields high false
positives, ultimately leading to poor accuracy. This motivates us to present a
hybrid approach, using a semi-trusted central entity, aiming to increase
utility from collaboration while, at the same time, limiting information
disclosure and false positives. This leads to a better trade-off of true and
false positive rates, while at the same time addressing privacy concerns.
Comment: A preliminary version of this paper appears in ACM SIGCOMM's Computer
Communication Review (Volume 48 Issue 5, October 2018). This is the full
version.
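The trade-off between true and false positive rates discussed in the abstract above can be made concrete with a small sketch that scores a predicted blacklist against the sources that actually attacked. The function name, the use of sets of source identifiers, and the sample data are assumptions for illustration and are not taken from the paper.

```python
def blacklist_rates(predicted, attackers, all_sources):
    """Return (true positive rate, false positive rate) of a predicted blacklist."""
    predicted, attackers, all_sources = set(predicted), set(attackers), set(all_sources)
    true_pos = len(predicted & attackers)    # blacklisted and did attack
    false_pos = len(predicted - attackers)   # blacklisted but did not attack
    false_neg = len(attackers - predicted)   # attacked but was not blacklisted
    true_neg = len(all_sources - predicted - attackers)
    tpr = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    fpr = false_pos / (false_pos + true_neg) if (false_pos + true_neg) else 0.0
    return tpr, fpr


# Hypothetical source identifiers.
tpr, fpr = blacklist_rates(
    predicted={"a", "b", "c"},
    attackers={"a", "b", "d"},
    all_sources={"a", "b", "c", "d", "e", "f"},
)
```

Counting only predicted attacks corresponds to looking at `true_pos` alone; reporting both rates shows when a longer blacklist gains hits at the cost of more false alarms.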