34 research outputs found
Provably Secure Decisions based on Potentially Malicious Information
Various security-critical decisions are routinely made on the basis of information provided by peers: routing messages, user reports, sensor data, navigational information, blockchain updates, etc. Jury theorems were proposed in sociology to make decisions based on information from peers, and they assume peers may be mistaken with some probability. We focus on attackers in a system, which manifest as peers that strategically report fake information to manipulate decision making. We define the property of robustness: a lower bound on the probability of deciding correctly, regardless of what information attackers provide. When peers are independently selected, we propose an optimal, robust decision mechanism called Most Probable Realisation (MPR). When peer collusion affects source selection, we prove that generally it is NP-hard to find an optimal decision scheme. We propose multiple heuristic decision schemes that can achieve optimality for some collusion scenarios.
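As a rough illustration of the jury-theorem setting and of a robustness-style lower bound, the sketch below computes the probability that a simple majority vote decides correctly. It is not the MPR mechanism from the abstract; plain majority voting is swapped in for illustration, and the function name, the `attackers` parameter, and the assumption that attackers always report the wrong value are all hypothetical.

```python
from math import comb


def majority_correct_prob(n: int, p: float, attackers: int = 0) -> float:
    """Probability that a strict majority of n reports is correct.

    Assumption (not from the source): `attackers` peers always report the
    wrong value, and each remaining honest peer is independently correct
    with probability p.
    """
    honest = n - attackers
    need = n // 2 + 1  # number of correct reports for a strict majority
    # Sum the binomial probabilities of having at least `need` correct
    # reports among the honest peers; the sum is empty (0) if need > honest.
    return sum(
        comb(honest, k) * p**k * (1 - p) ** (honest - k)
        for k in range(need, honest + 1)
    )


if __name__ == "__main__":
    print(majority_correct_prob(3, 0.8))               # no attackers
    print(majority_correct_prob(3, 0.8, attackers=1))  # one worst-case attacker
```

With three peers that are each correct with probability 0.8, the majority is correct with probability 0.896; if one of the three is a worst-case attacker, the value drops to 0.64, which is the kind of attacker-independent lower bound the robustness property describes.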
Unifying Threats Against Information Integrity In Participatory Crowd Sensing
This article proposes a unified threat landscape for participatory crowd sensing (P-CS) systems. Specifically, it focuses on attacks from organized malicious actors that may use knowledge of the P-CS platform's operations and exploit algorithmic weaknesses in AI-based methods of event trust, user reputation, decision-making, or recommendation models deployed to preserve information integrity in P-CS. We emphasize intent-driven malicious behaviors by advanced adversaries and how attacks are crafted to achieve those attack impacts. Three dimensions of the threat model are introduced: attack goals, types, and strategies. We expand on how various strategies are linked with different attack types and goals, underscoring their formal definition, relevance, and impact on the P-CS platform.
Trustworthy Federated Learning: A Survey
Federated Learning (FL) has emerged as a significant advancement in the field
of Artificial Intelligence (AI), enabling collaborative model training across
distributed devices while maintaining data privacy. As the importance of FL
increases, addressing trustworthiness issues in its various aspects becomes
crucial. In this survey, we provide an extensive overview of the current state
of Trustworthy FL, exploring existing solutions and well-defined pillars
relevant to Trustworthy FL. Despite the growth in literature on trustworthy
centralized Machine Learning (ML)/Deep Learning (DL), further efforts are
necessary to identify trustworthiness pillars and evaluation metrics specific
to FL models, as well as to develop solutions for computing trustworthiness
levels. We propose a taxonomy that encompasses three main pillars:
Interpretability, Fairness, and Security & Privacy. Each pillar represents a
dimension of trust, further broken down into different notions. Our survey
covers trustworthiness challenges at every level in FL settings. We present a
comprehensive architecture of Trustworthy FL, addressing the fundamental
principles underlying the concept, and offer an in-depth analysis of trust
assessment mechanisms. In conclusion, we identify key research challenges
related to every aspect of Trustworthy FL and suggest future research
directions. This comprehensive survey serves as a valuable resource for
researchers and practitioners working on the development and implementation of
Trustworthy FL systems, contributing to a more secure and reliable AI
landscape.
Comment: 45 pages, 8 figures, 9 tables
Edge Intelligence : Empowering Intelligence to the Edge of Network
Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in proximity to where data are captured, based on artificial intelligence. Edge intelligence aims to enhance data processing and protect the privacy and security of the data and users. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this article, we present a thorough and comprehensive survey of the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, i.e., edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare, and analyze the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, and so on. This article provides a comprehensive survey of edge intelligence and its application areas. In addition, we summarize the development of the emerging research fields and the current state of the art and discuss the important open issues and possible theoretical and technical directions. Peer reviewed
Quality control and cost management in crowdsourcing
By harvesting online workers’ knowledge, crowdsourcing has become an efficient and cost-effective way to obtain a large amount of labeled data for solving human intelligence tasks (HITs), such as entity resolution and sentiment analysis. Due to the open nature of crowdsourcing, online workers with different knowledge backgrounds may provide conflicting labels to tasks. Therefore, it is a common practice to perform a pre-determined number of assignments, either per task or for all tasks, and aggregate collected labels to infer the true label of tasks. This model could suffer from poor accuracy when under budget or waste resources when over budget. In addition, as worker labels are usually aggregated in a voting manner, crowdsourcing systems are vulnerable to strategic Sybil attacks, where the attacker may manipulate several robot Sybil workers to share randomized labels for outvoting independent workers and apply various strategies to evade Sybil detection. In this thesis, we are specifically interested in providing a guaranteed aggregation accuracy with minimum worker cost and defending against strategic Sybil attacks. In our first work, we assume that workers are independent and honest. By enforcing a specified accuracy threshold on aggregated labels and minimizing the worker cost under this requirement, we formulate the dual requirements for quality and cost as a Guaranteed Accuracy Problem (GAP) and present an efficient task assignment algorithm for solving the problem. In our second work, we assume that strategic Sybil attackers may coordinate Sybil workers to obtain rewards without honestly labeling tasks and apply different strategies to evade detection. By camouflaging golden tasks (i.e., tasks with known true labels) from the attacker and suppressing the impact of Sybil workers and low-quality independent workers, we extend the principled truth discovery to defend against strategic Sybil attacks in crowdsourcing.
For both works, we conduct comprehensive empirical evaluations on real and synthetic datasets to demonstrate the effectiveness and efficiency of our methods.
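The vulnerability of voting-based aggregation mentioned above can be sketched in a few lines. This is only an illustration of plain majority voting being outvoted, not the thesis's task assignment algorithm or its truth-discovery defense; the function name and the toy labels are hypothetical, and the Sybil workers are assumed to share one label for a single task.

```python
from collections import Counter


def aggregate_by_majority(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical labels for one sentiment task whose true label is "pos".
    independent = ["pos", "pos", "neg"]
    sybil = ["neg", "neg", "neg"]  # coordinated workers sharing one label

    print(aggregate_by_majority(independent))          # honest workers only
    print(aggregate_by_majority(independent + sybil))  # Sybil workers outvote them
```

With only the independent workers the aggregate is "pos"; once the three coordinated labels are added, the aggregate flips to "neg", which is why unweighted voting needs some form of Sybil suppression.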
Revealing the Landscape of Privacy-Enhancing Technologies in the Context of Data Markets for the IoT: A Systematic Literature Review
IoT data markets in public and private institutions have become increasingly
relevant in recent years because of their potential to improve data
availability and unlock new business models. However, exchanging data in
markets bears considerable challenges related to disclosing sensitive
information. Despite considerable research focused on different aspects of
privacy-enhancing data markets for the IoT, none of the solutions proposed so
far seems to find a practical adoption. Thus, this study aims to organize the
state-of-the-art solutions, analyze and scope the technologies that have been
suggested in this context, and structure the remaining challenges to determine
areas where future research is required. To accomplish this goal, we conducted
a systematic literature review on privacy enhancement in data markets for the
IoT, covering 50 publications dated up to July 2020, and provided updates with
24 publications dated up to May 2022. Our results indicate that most research
in this area has emerged only recently, and no IoT data market architecture has
established itself as canonical. Existing solutions frequently lack the
required combination of anonymization and secure computation technologies.
Furthermore, there is no consensus on the appropriate use of blockchain
technology for IoT data markets and a low degree of leveraging existing
libraries or reusing generic data market architectures. We also identified
significant challenges remaining, such as the copy problem and the recursive
enforcement problem that - while solutions have been suggested to some extent - are
often not sufficiently addressed in proposed designs. We conclude that
privacy-enhancing technologies need further improvements to positively impact
data markets so that, ultimately, the value of data is preserved through data
scarcity and users' privacy and business-critical information are protected.
Comment: 49 pages, 17 figures, 11 tables
Federated Learning in Mobile Edge Networks: A Comprehensive Survey
In recent years, mobile devices are equipped with increasingly advanced
sensing and computing capabilities. Coupled with advancements in Deep Learning
(DL), this opens up countless possibilities for meaningful applications.
Traditional cloud-based Machine Learning (ML) approaches require the data to be
centralized in a cloud server or data center. However, this results in critical
issues related to unacceptable latency and communication inefficiency. To this
end, Mobile Edge Computing (MEC) has been proposed to bring intelligence closer
to the edge, where data is produced. However, conventional enabling
technologies for ML at mobile edge networks still require personal data to be
shared with external parties, e.g., edge servers. Recently, in light of
increasingly stringent data privacy legislations and growing privacy concerns,
the concept of Federated Learning (FL) has been introduced. In FL, end devices
use their local data to train an ML model required by the server. The end
devices then send the model updates rather than raw data to the server for
aggregation. FL can serve as an enabling technology in mobile edge networks
since it enables the collaborative training of an ML model and also enables DL
for mobile edge network optimization. However, in a large-scale and complex
mobile edge network, heterogeneous devices with varying constraints are
involved. This raises challenges of communication costs, resource allocation,
and privacy and security in the implementation of FL at scale. In this survey,
we begin with an introduction to the background and fundamentals of FL. Then,
we highlight the aforementioned challenges of FL implementation and review
existing solutions. Furthermore, we present the applications of FL for mobile
edge network optimization. Finally, we discuss the important challenges and
future research directions in FL.
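The FL workflow described in this abstract (devices train locally, send model updates rather than raw data, and the server aggregates) can be sketched as follows. This is a minimal sketch under stated assumptions, not any surveyed system: the model is a single scalar weight fitted by least squares, aggregation is a sample-count-weighted average in the style of federated averaging, and all function names and the toy data are hypothetical.

```python
def local_update(w, data, lr=0.1):
    """One gradient step for a 1-D least-squares model y ~ w * x on local data."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def federated_average(updates):
    """Average (weight, num_samples) pairs, weighted by sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def run_round(w_global, device_datasets, lr=0.1):
    """One round: each device trains locally; only weights reach the server."""
    updates = [(local_update(w_global, d, lr), len(d)) for d in device_datasets]
    return federated_average(updates)


if __name__ == "__main__":
    # Hypothetical devices whose private data all follow y = 2 * x.
    devices = [[(1.0, 2.0)], [(2.0, 4.0), (3.0, 6.0)]]
    w = 0.0
    for _ in range(20):
        w = run_round(w, devices)
    print(w)  # approaches 2.0
```

The server in this sketch never sees the `(x, y)` pairs, only the locally updated weights and the sample counts used for weighting.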
Revealing the landscape of privacy-enhancing technologies in the context of data markets for the IoT: A systematic literature review
IoT data markets in public and private institutions have become increasingly relevant in recent years because of their potential to improve data availability and unlock new business models. However, exchanging data in markets bears considerable challenges related to disclosing sensitive information. Despite considerable research focused on different aspects of privacy-enhancing data markets for the IoT, none of the solutions proposed so far seems to find a practical adoption. Thus, this study aims to organize the state-of-the-art solutions, analyze and scope the technologies that have been suggested in this context, and structure the remaining challenges to determine areas where future research is required. To accomplish this goal, we conducted a systematic literature review on privacy enhancement in data markets for the IoT, covering 50 publications dated up to July 2020, and provided updates with 24 publications dated up to May 2022. Our results indicate that most research in this area has emerged only recently, and no IoT data market architecture has established itself as canonical. Existing solutions frequently lack the required combination of anonymization and secure computation technologies. Furthermore, there is no consensus on the appropriate use of blockchain technology for IoT data markets and a low degree of leveraging existing libraries or reusing generic data market architectures. We also identified significant challenges remaining, such as the copy problem and the recursive enforcement problem that - while solutions have been suggested to some extent - are often not sufficiently addressed in proposed designs. We conclude that privacy-enhancing technologies need further improvements to positively impact data markets so that, ultimately, the value of data is preserved through data scarcity and users' privacy and business-critical information are protected.