Quantifying Location Privacy Leakage from Transaction Prices
Large-scale datasets of consumer behavior might revolutionize the way we gain competitive advantages and increase our knowledge in the respective domains. At the same time, valuable datasets pose potential privacy risks that are difficult to foresee. In this paper, we study the impact that the prices in consumers’ purchase histories have on the consumers’ location privacy. We show that, using a small set of prices of low-priced products from the consumers’ purchase histories, an adversary can determine with high confidence the country, city, and local retail store where the transaction occurred. Our paper demonstrates that even when the product category, precise time of purchase, and currency are removed from the consumers’ purchase history (e.g., for privacy reasons), information about the consumers’ location is leaked. The results are based on three independent datasets containing thousands of low-priced and frequently bought consumer products. In addition, we show how to identify the local currency, given only the total price of a consumer purchase in a global currency (e.g., in Bitcoin). The results show the existence of location privacy risks when releasing consumer purchase histories. As such, the results highlight the need for systems that hide transaction details in consumer purchase histories.
On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms
Artificial Intelligence (AI) has made its way into various scientific fields,
providing astonishing improvements over existing algorithms for a wide variety
of tasks. In recent years, there have been severe concerns over the
trustworthiness of AI technologies. The scientific community has focused on the
development of trustworthy AI algorithms. However, machine and deep learning
algorithms, popular in the AI community today, depend heavily on the data used
during their development. These learning algorithms identify patterns in the
data, learning the behavioral objective. Any flaws in the data have the
potential to translate directly into algorithms. In this study, we discuss the
importance of Responsible Machine Learning Datasets and propose a framework to
evaluate the datasets through a responsible rubric. While existing work focuses
on the post-hoc evaluation of algorithms for their trustworthiness, we provide
a framework that considers the data component separately to understand its role
in the algorithm. We discuss responsible datasets through the lens of fairness,
privacy, and regulatory compliance and provide recommendations for constructing
future datasets. After surveying over 100 datasets, we use 60 datasets for
analysis and demonstrate that none of these datasets is immune to issues of
fairness, privacy preservation, and regulatory compliance. We provide
modifications to the "datasheets for datasets" with important additions for
improved dataset documentation. With governments around the world regularizing
data protection laws, the method for the creation of datasets in the scientific
community requires revision. We believe this study is timely and relevant in
today's era of AI.
Censorship-Resilient and Confidential Collateralized Second-Layer Payments
Permissionless blockchains are too slow for applications like
point-of-sale payments. While several techniques have been proposed to
speed up blockchain payments, none of them are satisfactory for application
scenarios like retail shopping. In particular, existing solutions like
payment channels require users to lock up significant funds, and schemes
based on pre-defined validators enable easy transaction censoring. In this
paper, we develop Quicksilver, the first blockchain payment scheme that
works with practical collaterals and is fast, censorship-resilient, and confidential
at the same time. We implement Quicksilver for EVM-compatible
chains and show that censorship-resilient payments are fast and affordable
on currently popular blockchain platforms like Ethereum and Polygon.