296 research outputs found
Personal Data Trading Scheme for Data Brokers in IoT Data Marketplaces
With the widespread use of IoT, data-driven services are taking the lead in both online and offline businesses. Personal data in particular draw heavy attention from service providers because of their usefulness in value-added services. With the emergence of big-data technology, data brokers have appeared, which exploit and sell personal data about individuals to third parties. Because there is little transparency between providers and brokers/consumers, people do not consider the current ecosystem trustworthy, and new regulations strengthening the rights of individuals have been introduced. As a result, people have taken an interest in the valuation of their privacy. In this sense, the willingness-to-sell (WTS) of providers becomes an important aspect for data brokers; however, conventional studies have mainly focused on the willingness-to-buy (WTB) of consumers. This paper therefore proposes an optimized trading model for data brokers, which buy personal data with proper incentives based on the WTS and sell valuable information from the refined dataset by considering the WTB and the dataset quality. The paper shows, using convex optimization, that the proposed model has a global optimum, and it proposes a gradient-ascent-based algorithm. Consequently, it shows that the proposed model is feasible even if data brokers incur costs to gather personal data.
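As a rough illustration of the gradient-ascent idea mentioned in this abstract, the sketch below maximizes a toy concave profit function of the incentive paid to data providers. The objective, the parameter values, and the names `broker_profit`, `profit_gradient` and `gradient_ascent` are assumptions made for illustration only; they are not the model from the paper.

```python
import math


def broker_profit(incentive, value=10.0, sensitivity=0.5):
    """Toy concave objective (hypothetical, not the paper's model).

    Revenue grows with diminishing returns as the incentive attracts more
    providers; the incentive itself is the broker's cost.
    """
    return value * math.log(1.0 + sensitivity * incentive) - incentive


def profit_gradient(incentive, value=10.0, sensitivity=0.5):
    """Derivative of broker_profit with respect to the incentive."""
    return value * sensitivity / (1.0 + sensitivity * incentive) - 1.0


def gradient_ascent(start=0.0, step=0.3, iterations=500):
    """Climb the concave objective, keeping the incentive non-negative."""
    incentive = start
    for _ in range(iterations):
        incentive = max(0.0, incentive + step * profit_gradient(incentive))
    return incentive


best_incentive = gradient_ascent()
# For this toy objective the optimum has the closed form value - 1/sensitivity.
closed_form = 10.0 - 1.0 / 0.5
```

Because the toy objective is concave, any stationary point the iteration reaches is the global maximum, which is the property the abstract relies on.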
Data Trading and Monetization: Challenges and Open Research Directions
Traditional data monetization approaches face challenges related to data
protection and logistics. In response, digital data marketplaces have emerged
as intermediaries simplifying data transactions. Despite the growing
establishment and acceptance of digital data marketplaces, significant
challenges hinder efficient data trading. As a result, few companies can derive
tangible value from their data, leading to missed opportunities in
understanding customers, pricing decisions, and fraud prevention. In this
paper, we explore both technical and organizational challenges affecting data
monetization. Moreover, we identify areas in need of further research, aiming
to expand the boundaries of current knowledge by emphasizing where research is
currently limited or lacking.
Comment: Paper accepted by the International Conference on Future Networks and
Distributed Systems (ICFNDS 2023)
Decentralized brokered enabled ecosystem for data marketplace in smart cities towards a data sharing economy
Data are now indispensable, as cities consider data a commodity that can be traded to earn revenue. In urban environments, data generated by Internet of Things devices, smart meters, smart sensors, etc. can provide a new source of income for the citizens and enterprises who own the data. These data can be traded as digital assets. To support such trading, digital data marketplaces have emerged. Data marketplaces promote a data sharing economy, which is crucial for making data available to cities that aim to develop data-driven services. But existing data marketplaces are mostly inadequate due to several issues such as security, efficiency, and adherence to privacy regulations. Likewise, there is no consolidated understanding of how to achieve trust and fairness among data owners and data sellers when trading data. Therefore, this study presents the design of an ecosystem comprising a distributed ledger technology data marketplace enabled by Message Queuing Telemetry Transport (MQTT) to facilitate trust and fairness among data owners and data sellers. The designed ecosystem for data marketplaces is powered by IOTA technology and an MQTT broker to support the trading of data sources by automating trade agreements, negotiations and payment settlement between data producers/sellers and data consumers/buyers. Overall, the findings of this article address the issues associated with developing a decentralized data marketplace for smart cities and suggest recommendations to enhance the deployment of decentralized and distributed data marketplaces.
Towards a human-centric data economy
Spurred by widespread adoption of artificial intelligence and machine learning, “data” is becoming
a key production factor, comparable in importance to capital, land, or labour in an increasingly
digital economy. In spite of an ever-growing demand for third-party data in the B2B
market, firms are generally reluctant to share their information. This is due to the unique characteristics
of “data” as an economic good (a freely replicable, non-depletable asset holding a highly
combinatorial and context-specific value), which moves digital companies to hoard and protect
their “valuable” data assets, and to integrate across the whole value chain seeking to monopolise
the provision of innovative services built upon them. As a result, most of those valuable assets
still remain unexploited in corporate silos nowadays.
This situation is shaping the so-called data economy around a number of champions, and it is
hampering the benefits of a global data exchange on a large scale. Some analysts have estimated
the potential value of the data economy at US$2.5 trillion globally by 2025. Not surprisingly, unlocking
the value of data has become a central policy of the European Union, which also estimated
the size of the data economy at €827 billion for the EU27 in the same period. Within the scope of
the European Data Strategy, the European Commission is also steering relevant initiatives aimed
to identify relevant cross-industry use cases involving different verticals, and to enable sovereign
data exchanges to realise them.
Among individuals, the massive collection and exploitation of personal data by digital firms
in exchange for services, often with little or no consent, has raised a general concern about privacy
and data protection. Apart from spurring recent legislative developments in this direction,
this concern has raised some voices warning against the unsustainability of the existing digital
economics (few digital champions, potential negative impact on employment, growing inequality),
some of which propose that people are paid for their data in a sort of worldwide data labour
market as a potential solution to this dilemma [114, 115, 155].
From a technical perspective, we are far from having the required technology and algorithms
that will enable such a human-centric data economy. Even its scope is still blurry, and the question
about the value of data is, at the least, controversial. Research works from different disciplines have
studied the data value chain, different approaches to the value of data, how to price data assets,
and novel data marketplace designs. At the same time, complex legal and ethical issues with
respect to the data economy have arisen around privacy, data protection, and ethical AI practices.
In this dissertation, we start by exploring the data value chain and how entities trade data assets
over the Internet. We carry out what is, to the best of our understanding, the most thorough survey
of commercial data marketplaces. In this work, we have catalogued and characterised ten different
business models, including those of personal information management systems, companies born
in the wake of recent data protection regulations and aiming at empowering end users to take
control of their data. We have also identified the challenges faced by different types of entities,
and what kind of solutions and technology they are using to provide their services.
Then we present a first-of-its-kind measurement study that sheds light on the prices of data
in the market using a novel methodology. We study how ten commercial data marketplaces categorise
and classify data assets, and which categories of data command higher prices. We also
develop classifiers for comparing data products across different marketplaces, and we study the
characteristics of the most valuable data assets and the features that specific vendors use to set
the price of their data products. Based on this information and adding data products offered by
33 other data providers, we develop a regression analysis for revealing features that correlate with
prices of data products. As a result, we also implement the basic building blocks of a novel data
pricing tool capable of providing a hint of the market price of a new data product using as inputs
just its metadata. This tool would provide more transparency on the prices of data products in
the market, which will help in pricing data assets and in avoiding the inherent price fluctuation of
nascent markets.
Next we turn to topics related to data marketplace design. Particularly, we study how buyers
can select and purchase suitable data for their tasks without requiring a priori access to such
data in order to make a purchase decision, and how marketplaces can distribute payoffs for a
data transaction combining data of different sources among the corresponding providers, be they
individuals or firms. The difficulty of both problems is further exacerbated in a human-centric
data economy where buyers have to choose among data of thousands of individuals, and where
marketplaces have to distribute payoffs to thousands of people contributing personal data to a
specific transaction.
Regarding the selection process, we compare different purchase strategies depending on the
level of information available to data buyers at the time of making decisions. A first methodological
contribution of our work is proposing a data evaluation stage prior to datasets being selected
and purchased by buyers in a marketplace. We show that buyers can significantly improve the
performance of the purchasing process just by being provided with a measurement of the performance
of their models when trained by the marketplace with individual eligible datasets. We
design purchase strategies that exploit such functionality and we call the resulting algorithm Try
Before You Buy, and our work demonstrates over synthetic and real datasets that it can lead to
near-optimal data purchasing with only O(N) instead of the exponential execution time, O(2^N),
needed to calculate the optimal purchase.
With regard to the payoff distribution problem, we focus on computing the relative value
of spatio-temporal datasets combined in marketplaces for predicting transportation demand and
travel time in metropolitan areas. Using large datasets of taxi rides from Chicago, Porto and
New York we show that the value of data is different for each individual, and cannot be approximated
by its volume. Our results reveal that even more complex approaches based on the
“leave-one-out” value are inaccurate. Instead, more complex and acknowledged notions of value
from economics and game theory, such as the Shapley value, need to be employed if one wishes
to capture the complex effects of mixing different datasets on the accuracy of forecasting algorithms.
However, the Shapley value entails serious computational challenges. Its exact calculation
requires repetitively training and evaluating every combination of data sources and hence O(N!)
or O(2^N) computational time, which is infeasible for complex models or thousands of individuals.
Moreover, our work paves the way to new methods of measuring the value of spatio-temporal
data. We identify heuristics such as entropy or similarity to the average that show a significant
correlation with the Shapley value and therefore can be used to overcome the significant computational
challenges posed by Shapley approximation algorithms in this specific context.
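To make the contrast between the leave-one-out value and the Shapley value concrete, the following minimal sketch computes both exactly for three hypothetical data sources. The coverage-based value function, the source names and the zone sets are illustrative assumptions, not the forecasting models or taxi datasets used in the dissertation.

```python
from itertools import combinations
from math import factorial

# Hypothetical sources: each one covers a set of city zones (assumed data).
COVERAGE = {
    "A": {1, 2, 3},
    "B": {1, 2, 3},  # redundant with A
    "C": {4},
}


def coalition_value(sources):
    """Toy value of a coalition: number of distinct zones it covers."""
    covered = set()
    for name in sources:
        covered |= COVERAGE[name]
    return len(covered)


def shapley_values(players, value):
    """Exact Shapley value; enumerates all 2^(N-1) subsets per player."""
    n = len(players)
    result = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                base = frozenset(subset)
                result[p] += weight * (value(base | {p}) - value(base))
    return result


def leave_one_out(players, value):
    """Marginal loss when a single player is removed from the full set."""
    full = frozenset(players)
    return {p: value(full) - value(full - {p}) for p in players}


players = list(COVERAGE)
shapley = shapley_values(players, coalition_value)
loo = leave_one_out(players, coalition_value)
```

In this toy setting the two redundant sources each receive a leave-one-out value of zero, while the Shapley value splits the credit for the shared zones between them. The nested enumeration also shows why the exact computation stops scaling beyond a handful of sources.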
We conclude with a number of open issues and propose further research directions that leverage
the contributions and findings of this dissertation. These include monitoring data transactions
to better measure data markets, and complementing market data with actual transaction prices
to build a more accurate data pricing tool. A human-centric data economy would also require
that the contributions of thousands of individuals to machine learning tasks are calculated daily.
For that to be feasible, we need to further optimise the efficiency of data purchasing and payoff
calculation processes in data marketplaces. In that direction, we also point to some alternatives
to repetitively training and evaluating a model to select data based on Try Before You Buy and
approximate the Shapley value. Finally, we discuss the challenges and potential technologies that
help with building a federation of standardised data marketplaces.
The data economy will develop fast in the upcoming years, and researchers from different
disciplines will work together to unlock the value of data and make the most out of it. Maybe
the proposal of getting paid for our data and our contribution to the data economy finally flies,
or maybe it is other proposals such as the robot tax that are finally used to balance the power
between individuals and tech firms in the digital economy. Still, we hope our work sheds light on
the value of data, and contributes to making the price of data more transparent and, eventually, to
moving towards a human-centric data economy.
This work has been supported by IMDEA Networks Institute.
Doctoral Programme in Telematics Engineering, Universidad Carlos III de Madrid.
President: Georgios Smaragdakis. Secretary: Ángel Cuevas Rumín. Member: Pablo Rodríguez Rodrígue
Revealing the Landscape of Privacy-Enhancing Technologies in the Context of Data Markets for the IoT: A Systematic Literature Review
IoT data markets in public and private institutions have become increasingly
relevant in recent years because of their potential to improve data
availability and unlock new business models. However, exchanging data in
markets bears considerable challenges related to disclosing sensitive
information. Despite considerable research focused on different aspects of
privacy-enhancing data markets for the IoT, none of the solutions proposed so
far seems to find a practical adoption. Thus, this study aims to organize the
state-of-the-art solutions, analyze and scope the technologies that have been
suggested in this context, and structure the remaining challenges to determine
areas where future research is required. To accomplish this goal, we conducted
a systematic literature review on privacy enhancement in data markets for the
IoT, covering 50 publications dated up to July 2020, and provided updates with
24 publications dated up to May 2022. Our results indicate that most research
in this area has emerged only recently, and no IoT data market architecture has
established itself as canonical. Existing solutions frequently lack the
required combination of anonymization and secure computation technologies.
Furthermore, there is no consensus on the appropriate use of blockchain
technology for IoT data markets and a low degree of leveraging existing
libraries or reusing generic data market architectures. We also identified
significant challenges remaining, such as the copy problem and the recursive
enforcement problem that - while solutions have been suggested to some extent - are
often not sufficiently addressed in proposed designs. We conclude that
privacy-enhancing technologies need further improvements to positively impact
data markets so that, ultimately, the value of data is preserved through data
scarcity and users' privacy and business-critical information are protected.
Comment: 49 pages, 17 figures, 11 tables
Competitive Data Trading Model with Privacy Valuation for Multiple Stakeholders in IoT Data Markets
With the spread of the Internet of Things (IoT) environment, the big data concept has emerged to handle the large amount of data generated by IoT devices. Moreover, since data-driven approaches have become important for business, IoT data markets have emerged, and IoT big data are exploited by major stakeholders such as data brokers and data service providers. Since many services and applications apply data analytic methods to data collected from IoT devices, conflicts between privacy and data exploitation have arisen, and the markets are mainly categorized as privacy protection markets and privacy valuation markets, respectively. As these kinds of data value chains (which are mainly considered by business stakeholders) are revealed, data providers become interested in proper incentives in exchange for their privacy (i.e., privacy valuation) under their agreement. Therefore, this paper proposes a competitive data trading model that consists of data providers, who weigh the value between privacy protection and valuation, as well as other business stakeholders. Each data broker considers the willingness-to-sell of data providers, and a single data service provider considers the willingness-to-pay of service consumers. At the same time, multiple data brokers compete to sell their datasets to the data service provider in a non-cooperative game model. Based on a Nash Equilibrium (NE) analysis of the game, the model is shown to be feasible: it has a unique NE that maximizes the profits of business stakeholders while satisfying all market participants.
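The notion of a unique Nash equilibrium in a non-cooperative game between competing sellers can be sketched with a much simpler stand-in. The example below uses a textbook two-seller quantity competition with a linear price, which is an assumption chosen for illustration and not the trading model of this paper; `best_response`, `find_equilibrium` and all parameter values are hypothetical.

```python
def best_response(rival_quantity, intercept=10.0, slope=1.0, unit_cost=1.0):
    """Quantity that maximizes one broker's profit given the rival's quantity.

    Profit is (intercept - slope * (q + rival_quantity) - unit_cost) * q,
    so the first-order condition gives the expression below.
    """
    quantity = (intercept - unit_cost - slope * rival_quantity) / (2.0 * slope)
    return max(0.0, quantity)


def find_equilibrium(rounds=100):
    """Iterate simultaneous best responses until they stop changing."""
    q1, q2 = 0.0, 0.0
    for _ in range(rounds):
        q1, q2 = best_response(q2), best_response(q1)
    return q1, q2


q1_star, q2_star = find_equilibrium()
# Closed form for this toy game: (intercept - unit_cost) / (3 * slope).
expected_quantity = (10.0 - 1.0) / (3.0 * 1.0)
```

At the fixed point neither broker can raise its profit by changing its own quantity alone, which is the defining property of a Nash equilibrium.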
Agora: A Privacy-Aware Data Marketplace
We propose Agora, the first blockchain-based data marketplace that enables multiple privacy-concerned parties to get compensated for contributing and exchanging data, without relying on a trusted third party during the exchange. Agora achieves data privacy, output verifiability, and atomicity of payments by leveraging cryptographic techniques, and is designed as a decentralized application via smart contracts. In particular, data generators provide encrypted data to data brokers, who use a functional secret key to learn nothing but the output of a specific, agreed-upon function over the raw data. Data consumers can purchase decrypted outputs from the brokers, accompanied by corresponding proofs of correctness. We implement a working prototype of Agora on Ethereum and experimentally evaluate its performance and deployment costs. As a core building block of Agora, we propose a new functional encryption scheme with additional public parameters that operate as a trust anchor for verifying decrypted results.
On Monetizing Personal Wearable Devices Data: A Blockchain-based Marketplace for Data Crowdsourcing and Federated Machine Learning in Healthcare
Machine learning advancements in healthcare have made data collected through smartphones and wearable devices a vital source of public health and medical insights. While wearable device data helps to monitor, detect, and predict diseases and health conditions, some data owners hesitate to share such sensitive data with companies or researchers due to privacy concerns. Moreover, wearable devices have only recently become available as commercial products; thus large, diverse, and representative datasets are not available to most researchers. In this article, we propose an open marketplace where wearable device users securely monetize their wearable device records by sharing data with consumers (e.g., researchers) to make wearable device data more available to healthcare researchers. To secure the data transactions in a privacy-preserving manner, we use a decentralized approach using Blockchain and Non-Fungible Tokens (NFTs). To ensure data originality and integrity with secure validation, our marketplace uses Trusted Execution Environments (TEE) in wearable devices to verify the correctness of health data. The marketplace also allows researchers to train models using Federated Learning with a TEE-backed secure aggregation of data users may not be willing to share. To ensure user participation, we model incentive mechanisms for the Federated Learning-based and anonymized data-sharing approaches using NFTs. We also propose using payment channels and batching to reduce smart contract gas fees and optimize user profits. If widely adopted, we believe that TEE and Blockchain-based incentives will promote the ethical use of machine learning with validated wearable device data in healthcare and improve user participation due to incentives.
Paying for Privacy and the Personal Data Economy
Growing demands for privacy and increases in the quantity and variety of consumer data have engendered various business offerings to allow companies, and in some instances consumers, to capitalize on these developments. One such example is the emerging “personal data economy” (PDE) in which companies, such as Datacoup, purchase data directly from individuals. At the opposite end of the spectrum, the “pay-for-privacy” (PFP) model requires consumers to pay an additional fee to prevent their data from being collected and mined for advertising purposes. This Article conducts a simultaneous in-depth exploration of the impact of burgeoning PDE and PFP models. It identifies a typology of data-business models, and it uncovers the similarities and tensions between a data market controlled by established companies that have historically collected and mined consumer data for their primary benefit and one in which consumers play a central role in monetizing their own data. The Article makes three claims. First, it contends that PFP models facilitate the transformation of privacy into a tradable product in the online setting, may worsen unequal access to privacy, and could further enable predatory and discriminatory behavior. Second, while the PDE may allow consumers to regain a semblance of control over their information by enabling them to decide when and with whom to share their data, consumers’ direct transfer or disclosure of personal data to companies for a price or personalized deals creates challenges similar to those found in the PFP context and generates additional concerns associated with innovative monetization techniques. Third, existing frameworks and proposals may not sufficiently ameliorate these concerns. The Article concludes by offering a path forward.