55 research outputs found
Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense
The field of cybersecurity has mostly been a cat-and-mouse game with the
discovery of new attacks leading the way. To take away an attacker's advantage
of reconnaissance, researchers have proposed proactive defense methods such as
Moving Target Defense (MTD). To find good movement strategies, researchers have
modeled MTD as leader-follower games between the defender and a
cyber-adversary. We argue that existing models are inadequate in sequential
settings when there is incomplete information about a rational adversary and
yield sub-optimal movement strategies. Further, while an array of work exists on learning defense policies in sequential settings for cyber-security, these approaches are either impractical due to scalability issues arising from incomplete information or tend to ignore the strategic nature of the adversary, simplifying the scenario to permit single-agent reinforcement learning techniques. To address
these concerns, we propose (1) a unifying game-theoretic model, called Bayesian Stackelberg Markov Games (BSMGs), that can model uncertainty over
attacker types and the nuances of an MTD system and (2) a Bayesian Strong
Stackelberg Q-learning (BSS-Q) approach that can, via interaction, learn the
optimal movement policy for BSMGs within a reasonable time. We situate BSMGs in
the landscape of incomplete-information Markov games and characterize the
notion of Strong Stackelberg Equilibrium (SSE) in them. We show that our
learning approach converges to an SSE of a BSMG and then highlight that the
learned movement policy (1) improves the state-of-the-art in MTD for
web-application security and (2) converges to an optimal policy in MTD domains
with incomplete information about adversaries even when prior information about
rewards and transitions is absent.
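The abstract names the BSS-Q update but does not spell it out, so the following is only a minimal sketch of the core idea under assumed toy defender moves, attacker types, and payoffs (none of which come from the paper): at each stage the defender commits to the action that is best given each attacker type's best response, and Q-values are bootstrapped from that equilibrium value. For brevity the leader is restricted to pure strategies, whereas the paper's BSS-Q computes a mixed-strategy Strong Stackelberg Equilibrium per stage game.

```python
from collections import defaultdict

# Hypothetical BSMG stage-game sketch: toy defender moves, attacker types,
# and exploits -- none of these names come from the paper.
D_ACTIONS = ["shuffle_config_A", "shuffle_config_B"]
A_ACTIONS = ["exploit_X", "exploit_Y"]
TYPE_PRIOR = {"script_kiddie": 0.6, "apt": 0.4}  # belief over attacker types
GAMMA, ALPHA = 0.9, 0.1

# Q[s][(d, theta, a)] -> (defender_payoff, attacker_payoff), learned online.
Q = defaultdict(lambda: defaultdict(lambda: (0.0, 0.0)))

def stage_value(s):
    """Leader's value of the stage game at s: commit to the action that is
    best given each attacker type's best response, breaking attacker ties
    in the leader's favor (the 'strong' in Strong Stackelberg)."""
    best = float("-inf")
    for d in D_ACTIONS:
        val = 0.0
        for theta, p in TYPE_PRIOR.items():
            br = max(A_ACTIONS, key=lambda a: (Q[s][(d, theta, a)][1],
                                               Q[s][(d, theta, a)][0]))
            val += p * Q[s][(d, theta, br)][0]
        best = max(best, val)
    return best

def bss_q_update(s, d, theta, a, r_def, r_att, s_next):
    """One Q-update after observing a joint transition in the Markov game."""
    qd, qa = Q[s][(d, theta, a)]
    Q[s][(d, theta, a)] = (
        (1 - ALPHA) * qd + ALPHA * (r_def + GAMMA * stage_value(s_next)),
        (1 - ALPHA) * qa + ALPHA * r_att,  # myopic follower value, for brevity
    )

bss_q_update(0, "shuffle_config_A", "apt", "exploit_X", 1.0, -1.0, 1)
```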
Game Theory for Cyber Deception: A Tutorial
Deceptive and anti-deceptive technologies have been developed for various
specific applications. But there is a significant need for a general, holistic,
and quantitative framework of deception. Game theory provides an ideal set of
tools to develop such a framework of deception. In particular, game theory
captures the strategic and self-interested nature of attackers and defenders in
cybersecurity. Additionally, control theory can be used to quantify the
physical impact of attack and defense strategies. In this tutorial, we present
an overview of game-theoretic models and design mechanisms for deception and
counter-deception. The tutorial aims to provide a taxonomy of deception and
counter-deception, and to explain how they can be conceptualized, quantified, and designed or mitigated. This tutorial gives an overview of diverse methodologies from game theory, including games of incomplete information, dynamic games, and mechanism design theory, to offer a modern theoretical underpinning of cyber deception. The tutorial also discusses open problems and research challenges that the HoTSoS community can address, with the objective of building a multidisciplinary bridge between cybersecurity, economics, and game and decision theory.
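To make the "games of incomplete information" ingredient concrete, here is a toy deception signaling game, not drawn from the tutorial itself: the attacker forms a Bayesian posterior over whether a machine is a real server or a honeypot from a defender-controlled signal, then best-responds. All probabilities and payoffs are illustrative assumptions.

```python
# Toy deception signaling game; all numbers are illustrative assumptions.
PRIOR = {"real": 0.7, "honeypot": 0.3}            # nature's distribution
# Defender's (deceptive) signaling strategy: P(signal | type)
SIGNAL = {"real":     {"normal": 0.9, "weird": 0.1},
          "honeypot": {"normal": 0.8, "weird": 0.2}}
# Attacker's payoff for attacking each type (withdrawing pays 0)
ATTACK_PAYOFF = {"real": 5.0, "honeypot": -10.0}

def posterior(signal):
    """Bayes update over system type after observing a signal."""
    joint = {t: PRIOR[t] * SIGNAL[t][signal] for t in PRIOR}
    z = sum(joint.values())
    return {t: p / z for t, p in joint.items()}

def attacker_best_response(signal):
    post = posterior(signal)
    eu_attack = sum(post[t] * ATTACK_PAYOFF[t] for t in post)
    return ("attack" if eu_attack > 0 else "withdraw"), eu_attack

for sig in ("normal", "weird"):
    post = posterior(sig)
    action, eu = attacker_best_response(sig)
    print(f"signal={sig}: P(real)={post['real']:.2f} -> {action} (EU={eu:.2f})")
```

With these numbers, a machine that signals "weird" deters the attack even though it is more likely real than not, which is exactly the kind of strategic effect the tutorial's framework is meant to quantify.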
Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense
The increasing instances of advanced attacks call for a new defense paradigm that is active, autonomous, and adaptive, termed the '3A' defense paradigm. This chapter introduces three defense schemes that actively interact
with attackers to increase the attack cost and gather threat information, i.e.,
defensive deception for detection and counter-deception, feedback-driven Moving
Target Defense (MTD), and adaptive honeypot engagement. Due to cyber deception, external noise, and absent knowledge of the other players' behaviors and goals, these schemes face three progressive levels of information restriction: parameter uncertainty, payoff uncertainty, and environmental uncertainty. To estimate the unknowns and reduce uncertainty, we adopt three different strategic learning schemes that fit the associated information restrictions. All three learning schemes share the same feedback structure of sensing, estimation, and action, so that the most rewarding policies are reinforced and converge to the optimal ones in an autonomous and adaptive fashion. This work aims to shed light on proactive defense strategies, lay a solid foundation for strategic learning under incomplete information, and quantify the tradeoff between security and cost.
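As a rough illustration of the shared sense-estimate-act feedback loop, the sketch below runs epsilon-greedy Q-learning on a hypothetical honeypot-engagement MDP; the states, actions, and reward numbers are invented for illustration and are not the chapter's model.

```python
import random
from collections import defaultdict

# Hypothetical honeypot-engagement MDP: higher interaction yields more
# threat intelligence (reward) but risks the attacker leaving.
STATES = ["recon", "engaged", "deep_engaged", "left"]
ACTIONS = ["low_interaction", "high_interaction", "eject"]
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1

Q = defaultdict(float)

def env_step(s, a):
    """Toy environment dynamics; all numbers are assumptions."""
    if a == "eject" or s == "left":
        return "left", 0.0
    if a == "high_interaction":
        return ("deep_engaged", 3.0) if random.random() < 0.6 else ("left", -1.0)
    return "engaged", 1.0

def policy(s):
    if random.random() < EPS:                      # explore
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])   # exploit current estimate

s = "recon"
for _ in range(10_000):
    a = policy(s)                                   # act
    s2, r = env_step(s, a)                          # sense
    best_next = max(Q[(s2, b)] for b in ACTIONS)    # estimate
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
    s = "recon" if s2 == "left" else s2             # episode reset
```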
Game Theory in Distributed Systems Security: Foundations, Challenges, and Future Directions
Many of our critical infrastructure systems and personal computing systems are structured as distributed computing systems. The incentives to attack them have been growing rapidly, as has their attack surface, due to increasing levels of connectedness. Therefore, we feel it is time to bring rigorous reasoning to the task of securing such systems. The distributed-systems security and game theory technical communities can come together to effectively address this challenge.
In this article, we lay out the foundations from each that we can build upon to
achieve our goals. Next, we describe a set of research challenges for the
community, organized into three categories -- analytical, systems, and integration challenges -- each with "short term" (2-3 years) and "long term" (5-10 years) items. This article was conceived through a community discussion at the 2022 NSF SaTC PI meeting.
Multi-Agent Learning for Security and Sustainability
This thesis studies the application of multi-agent learning in complex domains where safety and sustainability are crucial. We target some of the main obstacles to the deployment of multi-agent learning techniques in such domains. These obstacles consist of modelling complex environments with multi-agent interaction, designing robust learning processes, and modelling adversarial agents. The main goal of using modern multi-agent learning methods is to improve the effectiveness of behaviour in such domains, and hence increase sustainability and security. This thesis investigates three complex real-world domains: space debris removal, critical domains with risky states, and spatial security domains such as illegal rhino poaching.

We first tackle the challenge of modelling a complex multi-agent environment. The focus is on the space debris removal problem, which poses a major threat to the sustainability of Earth orbit. We develop a high-fidelity space debris simulator that allows us to simulate the future evolution of the space debris environment. Using the data from the simulator, we propose a surrogate model, which enables fast evaluation of different strategies chosen by the space actors. We then analyse the dynamics of strategic decision making among multiple space actors, comparing different models of agent interaction: static vs. dynamic and centralised vs. decentralised. The outcome of our work can help future decision makers to design debris removal strategies, and consequently mitigate the threat of space debris.

Next, we study how to design a robust learning process in critical domains with risky states, where destabilisation of local components can lead to severe impact on the whole network. We propose a novel robust operator κ which can be combined with reinforcement learning methods, leading to learning safe policies and mitigating the threat of external attack or failure in the system.

Finally, we investigate the challenge of learning an effective behaviour while facing adversarial attackers in spatial security domains such as illegal rhino poaching. We assume that such attackers can be occasionally observed. Our approach combines Bayesian inference with temporal difference learning in order to build a model of the attacker's behaviour. Our method can effectively use partial observability of the attacker's location and approximate the performance of the full-observability case.

This thesis therefore presents novel methods and tackles several important obstacles to deploying multi-agent learning algorithms in the real world, further narrowing the reality gap between theoretical models and real-world applications.
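The abstract names the robust operator κ but does not define it, so the sketch below substitutes a generic pessimistic backup, blending the best-case and worst-case action values, as one plausible stand-in for a robustness operator in value iteration. The toy MDP and the pessimism weight BETA are assumptions, not the thesis's construction.

```python
import numpy as np

# Stand-in for an undefined robust operator: a pessimistic backup that
# hedges against destabilised (risky) states. Toy MDP throughout.
N_STATES, N_ACTIONS = 5, 3
GAMMA, BETA = 0.9, 0.3   # BETA = degree of pessimism (assumption)

rng = np.random.default_rng(0)
R = rng.uniform(-1, 1, (N_STATES, N_ACTIONS))                 # rewards
P = rng.dirichlet(np.ones(N_STATES), (N_STATES, N_ACTIONS))   # transitions

def kappa(q_row):
    """Robust state value: mix greedy and worst-case action values."""
    return (1 - BETA) * q_row.max() + BETA * q_row.min()

Q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(500):                       # robust value iteration
    V = np.array([kappa(Q[s]) for s in range(N_STATES)])
    Q = R + GAMMA * P @ V

print("robust greedy policy:", Q.argmax(axis=1))
```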
Contributions to the Security of Machine Learning
Machine learning (ML) applications have experienced unprecedented growth over the last two decades. However, the ever-increasing adoption of ML methodologies has revealed important security issues. Among these, vulnerabilities to adversarial examples, data instances targeted at fooling ML algorithms, are especially important. Examples abound. For instance, it is relatively easy to fool a spam detector simply by misspelling spam words. Obfuscation of malware code can make it seem legitimate. Simply adding stickers to a stop sign could make an autonomous vehicle classify it as a merge sign. Consequences could be catastrophic. Indeed, ML is designed to work in stationary and benign environments. However, in certain scenarios, the presence of adversaries that actively manipulate input data to fool ML systems to attain benefits breaks such stationarity requirements. Training and operation conditions are no longer identical. This creates a whole new class of security vulnerabilities that ML systems may face and a new desirable property: adversarial robustness. If we are to trust operations based on ML outputs, it becomes essential that learning systems are robust to such adversarial manipulations.
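A minimal evasion attack makes the threat concrete. The sketch below applies an FGSM-style perturbation (a standard technique, not specific to this thesis) to a toy linear classifier; the model weights and input are synthetic.

```python
import numpy as np

# FGSM-style evasion attack on a toy linear classifier; synthetic data.
rng = np.random.default_rng(1)
w, b = rng.normal(size=8), 0.1          # a "trained" linear model
x = rng.normal(size=8)                  # a benign input

def predict(x):
    return 1 / (1 + np.exp(-(w @ x + b)))   # P(class = 1)

# For true label y=1 and loss -log sigmoid(w.x + b), the input gradient
# is (sigmoid - 1) * w; FGSM adds eps * sign(grad) to increase the loss.
eps = 0.25
grad_wrt_x = (predict(x) - 1) * w
x_adv = x + eps * np.sign(grad_wrt_x)

print(f"clean       P(y=1) = {predict(x):.3f}")
print(f"adversarial P(y=1) = {predict(x_adv):.3f}")  # pushed toward class 0
```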
Adversarial Security Attacks and Perturbations on Machine Learning and Deep Learning Methods
The ever-growing big data and emerging artificial intelligence (AI) demand
the use of machine learning (ML) and deep learning (DL) methods. Cybersecurity
also benefits from ML and DL methods for various types of applications. These
methods, however, are susceptible to security attacks. Adversaries can exploit the training and testing data of the learning models, or can probe the workings of those models to launch advanced future attacks. The topic of adversarial security attacks and perturbations within the ML and DL domains is a recent area of exploration, and great interest has been expressed by security researchers and practitioners. The literature covers different adversarial security attacks and perturbations on ML and DL methods, each with its own presentation style and merits. A review that consolidates knowledge of this increasingly prominent and growing topic of research is, however, a current demand of the research communities. In this review paper, we specifically aim to target new researchers in the cybersecurity domain who may seek to acquire basic knowledge of machine learning and deep learning models and algorithms, as well as of the relevant adversarial security attacks and perturbations.
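As a concrete instance of attacks that exploit the training data, here is an illustrative label-flipping poisoning attack against a nearest-centroid classifier; the dataset, model, and attack budget are all synthetic assumptions, not drawn from the review.

```python
import numpy as np

# Label-flipping poisoning attack against a nearest-centroid classifier.
rng = np.random.default_rng(2)
X0 = rng.normal(loc=-2, size=(100, 2))   # class 0 samples
X1 = rng.normal(loc=+2, size=(100, 2))   # class 1 samples
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

def train_and_test(y_train):
    """Fit centroids on (possibly poisoned) labels, score on clean labels."""
    c0 = X[y_train == 0].mean(axis=0)
    c1 = X[y_train == 1].mean(axis=0)
    pred = (np.linalg.norm(X - c1, axis=1) <
            np.linalg.norm(X - c0, axis=1)).astype(int)
    return (pred == y).mean()

print(f"clean accuracy:    {train_and_test(y):.2f}")

y_poisoned = y.copy()
flip = rng.choice(len(y), size=40, replace=False)   # 20% label budget
y_poisoned[flip] = 1 - y_poisoned[flip]
print(f"poisoned accuracy: {train_and_test(y_poisoned):.2f}")
```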
Game Theoretic Approaches in Vehicular Networks: A Survey
In the era of the Internet of Things (IoT), vehicles and other intelligent
components in Intelligent Transportation Systems (ITS) are connected, forming the Vehicular Networks (VNs) that provide efficient and secure traffic, ubiquitous access to information, and various applications. However, as the number of connected nodes keeps increasing, it is challenging to satisfy large numbers of varied service requests with different Quality of Service (QoS) and security requirements in highly dynamic VNs. Intelligent nodes in VNs can compete or cooperate for limited network resources so that either individual or group objectives can be achieved. Game theory (GT), a theoretical framework designed for strategic interactions among rational decision-makers facing scarce resources, can be used to model and analyze individual or group behaviors of communication entities in VNs. This paper primarily surveys the recent advances of GT in solving various challenges in VNs. As VNs and GT have been extensively investigated, this survey starts with a brief
introduction of the basic concept and classification of GT used in VNs. Then, a
comprehensive review of applications of GT in VNs is presented, which primarily
covers the aspects of QoS and security. Moreover, with the development of
fifth-generation (5G) wireless communication, recent contributions of GT to
diverse emerging technologies of 5G integrated into VNs are surveyed in this
paper. Finally, several key research challenges and possible solutions for
applying GT in VNs are outlined.
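One way to see how intelligent nodes competing for limited network resources plays out is best-response dynamics in a toy channel-selection congestion game, which converges to a pure Nash equilibrium because singleton congestion games are potential games. The vehicle count and channel capacities below are illustrative assumptions.

```python
import random

# Toy channel-selection congestion game for vehicles sharing spectrum.
N_VEHICLES, CHANNELS = 12, [0, 1, 2]
CAPACITY = {0: 6.0, 1: 4.0, 2: 2.0}       # channel bandwidths (assumed)

def utility(ch, load):
    return CAPACITY[ch] / load             # equal share of the channel

choice = [random.choice(CHANNELS) for _ in range(N_VEHICLES)]

changed = True
while changed:                             # asynchronous best responses
    changed = False
    for i in range(N_VEHICLES):
        loads = {c: sum(1 for j, cj in enumerate(choice)
                        if cj == c and j != i) for c in CHANNELS}
        best = max(CHANNELS, key=lambda c: utility(c, loads[c] + 1))
        if best != choice[i]:
            choice[i], changed = best, True

# At equilibrium, loads track capacities (here 6/4/2), so no vehicle
# can improve its share by unilaterally switching channels.
print("equilibrium loads:", {c: choice.count(c) for c in CHANNELS})
```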
Artificial Intelligence for Social Good: A Survey
Artificial intelligence for social good (AI4SG) is a research theme that aims
to use and advance artificial intelligence to address societal issues and
improve the well-being of the world. AI4SG has received considerable attention from the research community in the past decade, with several successful applications.
Building on the most comprehensive collection of the AI4SG literature to date
with over 1000 contributed papers, we provide a detailed account and analysis
of the work under the theme in the following ways. (1) We quantitatively
analyze the distribution and trend of the AI4SG literature in terms of
application domains and AI techniques used. (2) We propose three conceptual
methods to systematically group the existing literature and analyze the eight
AI4SG application domains in a unified framework. (3) We distill five research
topics that represent the common challenges in AI4SG across various application
domains. (4) We discuss five issues that, we hope, can shed light on the future development of AI4SG research.
Making friends on the fly: advances in ad hoc teamwork
Given the continuing improvements in design and manufacturing processes in addition to improvements in artificial intelligence, robots are being deployed in an increasing variety of environments for longer periods of time. As the number of robots grows, it is expected that they will encounter and interact with other robots. Additionally, the number of companies and research laboratories producing these robots is increasing, leading to the situation where these robots may not share a common communication or coordination protocol. While standards for coordination and communication may be created, we expect that any standards will lag behind the state-of-the-art protocols and robots will need to additionally reason intelligently about their teammates with limited information. This problem motivates the area of ad hoc teamwork in which an agent may potentially cooperate with a variety of teammates in order to achieve a shared goal. We argue that agents that effectively reason about ad hoc teamwork need to exhibit three capabilities: 1) robustness to teammate variety, 2) robustness to diverse tasks, and 3) fast adaptation. This thesis focuses on addressing all three of these challenges. In particular, this thesis introduces algorithms for quickly adapting to unknown teammates that enable agents to react to new teammates without extensive observations.
The majority of existing multiagent algorithms focus on scenarios where all agents share coordination and communication protocols. While previous research on ad hoc teamwork considers some of these three challenges, this thesis introduces a new algorithm, PLASTIC, that is the first to address all three challenges in a single algorithm. PLASTIC adapts quickly to unknown teammates by reusing knowledge it learns about previous teammates and exploiting any expert knowledge available. Given this knowledge, PLASTIC selects which previous teammates are most similar to the current ones online and uses this information to adapt to their behaviors. This thesis introduces two instantiations of PLASTIC. The first is a model-based approach, PLASTIC-Model, that builds models of previous teammates' behaviors and plans online to determine the best course of action. The second uses a policy-based approach, PLASTIC-Policy, in which it learns policies for cooperating with past teammates and selects from among these policies online. Furthermore, we introduce a new transfer learning algorithm, TwoStageTransfer, that allows transferring knowledge from many past teammates while considering how similar each teammate is to the current ones.
We theoretically analyze the computational tractability of PLASTIC-Model in a number of scenarios with unknown teammates. Additionally, we empirically evaluate PLASTIC in three domains that cover a spread of possible settings. Our evaluations show that PLASTIC can learn to communicate with unknown teammates using a limited set of messages, coordinate with externally-created teammates that do not reason about ad hoc teams, and act intelligently in domains with continuous states and actions. Furthermore, these evaluations show that TwoStageTransfer outperforms existing transfer learning algorithms and enables PLASTIC to adapt even better to new teammates. We also identify three dimensions that we argue best describe ad hoc teamwork scenarios. We hypothesize that these dimensions are useful for analyzing similarities among domains and determining which can be tackled by similar algorithms in addition to identifying avenues for future research. The work presented in this thesis represents an important step towards enabling agents to adapt to unknown teammates in the real world. PLASTIC significantly broadens the robustness of robots to their teammates and allows them to quickly adapt to new teammates by reusing previously learned knowledge.
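The belief update at the heart of PLASTIC can be sketched compactly: maintain a weight over models of previously seen teammates and discount each model by the loss it incurs on the observed action, in the spirit of the polynomial-weights update the thesis builds on. The teammate models and action set below are hypothetical stand-ins, not taken from the thesis.

```python
# Sketch of PLASTIC-style teammate-model selection over hypothetical models.
MODELS = {
    "aggressive": {"pass": 0.1, "shoot": 0.8, "dribble": 0.1},
    "supportive": {"pass": 0.7, "shoot": 0.1, "dribble": 0.2},
    "balanced":   {"pass": 0.4, "shoot": 0.3, "dribble": 0.3},
}
belief = {m: 1 / len(MODELS) for m in MODELS}   # uniform prior
ETA = 0.9   # loss-bounding factor, as in polynomial-weights updates

def observe(action):
    """Update belief after seeing the new teammate act."""
    global belief
    # Discount each model by its loss (1 - probability of the observed
    # action); beliefs never hit zero, so the agent can recover if the
    # teammate's behaviour changes.
    belief = {m: belief[m] * (1 - ETA * (1 - p[action]))
              for m, p in MODELS.items()}
    z = sum(belief.values())
    belief = {m: b / z for m, b in belief.items()}

for a in ["pass", "pass", "dribble", "pass"]:
    observe(a)
print("most similar past teammate:", max(belief, key=belief.get))
```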