55 research outputs found

    Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense

    Full text link
    The field of cybersecurity has mostly been a cat-and-mouse game, with the discovery of new attacks leading the way. To take away an attacker's advantage of reconnaissance, researchers have proposed proactive defense methods such as Moving Target Defense (MTD). To find good movement strategies, researchers have modeled MTD as leader-follower games between the defender and a cyber-adversary. We argue that existing models are inadequate in sequential settings when there is incomplete information about a rational adversary, and that they yield sub-optimal movement strategies. Further, while there exists an array of work on learning defense policies in sequential settings for cyber-security, these approaches are either impractical due to scalability issues arising out of incomplete information or ignore the strategic nature of the adversary, simplifying the scenario enough to use single-agent reinforcement learning techniques. To address these concerns, we propose (1) a unifying game-theoretic model, called the Bayesian Stackelberg Markov Game (BSMG), that can model uncertainty over attacker types and the nuances of an MTD system and (2) a Bayesian Strong Stackelberg Q-learning (BSS-Q) approach that can, via interaction, learn the optimal movement policy for BSMGs within a reasonable time. We situate BSMGs in the landscape of incomplete-information Markov games and characterize the notion of Strong Stackelberg Equilibrium (SSE) in them. We show that our learning approach converges to an SSE of a BSMG and then highlight that the learned movement policy (1) improves on the state of the art in MTD for web-application security and (2) converges to an optimal policy in MTD domains with incomplete information about adversaries, even when prior information about rewards and transitions is absent.
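    The core idea behind BSS-Q can be sketched in a few lines. The Python sketch below is a minimal illustration (not the authors' implementation) of Stackelberg Q-learning over attacker types: the defender keeps per-type Q-tables for both players, commits to the action with the best type-averaged value given each attacker type's best response, and performs ordinary Q-learning backups against the sampled type. For readability the defender is restricted to pure strategies, whereas BSS-Q as described in the paper computes a mixed Strong Stackelberg Equilibrium (e.g. via a linear program) at every state; all class and variable names here are assumptions for illustration.

    import random
    from collections import defaultdict

    GAMMA, ALPHA, EPS = 0.9, 0.1, 0.2

    class BayesianStackelbergQ:
        def __init__(self, d_actions, a_actions, type_prior):
            self.d_actions, self.a_actions = d_actions, a_actions
            self.type_prior = type_prior                            # {attacker_type: probability}
            self.Qd = {t: defaultdict(float) for t in type_prior}   # defender Q-values per type
            self.Qa = {t: defaultdict(float) for t in type_prior}   # attacker Q-values per type

        def stackelberg_action(self, state):
            # The defender commits first; each attacker type best-responds to the
            # committed (pure) action.  Pick the defender action with the best
            # type-averaged defender value under those best responses.
            def value(d):
                total = 0.0
                for t, prob in self.type_prior.items():
                    br = max(self.a_actions, key=lambda a: self.Qa[t][(state, d, a)])
                    total += prob * self.Qd[t][(state, d, br)]
                return total
            return max(self.d_actions, key=value)

        def act(self, state):
            if random.random() < EPS:        # occasional exploration
                return random.choice(self.d_actions)
            return self.stackelberg_action(state)

        def update(self, t, s, d, a, r_def, r_att, s_next):
            # Standard Q-learning backup for the sampled attacker type t,
            # bootstrapping on the value of the next state's Stackelberg play.
            d_next = self.stackelberg_action(s_next)
            a_next = max(self.a_actions, key=lambda x: self.Qa[t][(s_next, d_next, x)])
            for Q, r in ((self.Qd[t], r_def), (self.Qa[t], r_att)):
                target = r + GAMMA * Q[(s_next, d_next, a_next)]
                Q[(s, d, a)] += ALPHA * (target - Q[(s, d, a)])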

    Game Theory for Cyber Deception: A Tutorial

    Full text link
    Deceptive and anti-deceptive technologies have been developed for various specific applications, but there is a significant need for a general, holistic, and quantitative framework of deception. Game theory provides an ideal set of tools to develop such a framework. In particular, game theory captures the strategic and self-interested nature of attackers and defenders in cybersecurity. Additionally, control theory can be used to quantify the physical impact of attack and defense strategies. In this tutorial, we present an overview of game-theoretic models and design mechanisms for deception and counter-deception. The tutorial aims to provide a taxonomy of deception and counter-deception and to explain how they can be conceptualized, quantified, and designed or mitigated. It gives an overview of diverse methodologies from game theory, including games of incomplete information, dynamic games, and mechanism design theory, to offer a modern theoretical underpinning of cyber deception. The tutorial will also discuss open problems and research challenges that the HoTSoS community can address and contribute to, with the objective of building a multidisciplinary bridge between cybersecurity, economics, and game and decision theory. Comment: arXiv admin note: substantial text overlap with arXiv:1808.0806

    Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense

    Full text link
    The increasing instances of advanced attacks call for a new defense paradigm that is active, autonomous, and adaptive, named the '3A' defense paradigm. This chapter introduces three defense schemes that actively interact with attackers to increase the attack cost and gather threat information: defensive deception for detection and counter-deception, feedback-driven Moving Target Defense (MTD), and adaptive honeypot engagement. Due to cyber deception, external noise, and the absence of knowledge about the other players' behaviors and goals, these schemes face three progressive levels of information restriction, ranging from parameter uncertainty and payoff uncertainty to environmental uncertainty. To estimate the unknowns and reduce uncertainty, we adopt three different strategic learning schemes that fit the associated information restrictions. All three learning schemes share the same feedback structure of sensation, estimation, and action, so that the most rewarding policies get reinforced and converge to the optimal ones in an autonomous and adaptive fashion. This work aims to shed light on proactive defense strategies, lay a solid foundation for strategic learning under incomplete information, and quantify the tradeoff between security and cost. Comment: arXiv admin note: text overlap with arXiv:1906.1218
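    As a rough illustration of the shared sensation-estimation-action feedback structure described above, the following Python sketch shows a defender that repeatedly acts, receives noisy feedback, and refines a running estimate so that the most rewarding choice gets reinforced. It is a schematic under assumed interfaces, not the chapter's algorithms, and the configuration names are hypothetical.

    import random

    def strategic_learning_loop(actions, true_reward, episodes=500, noise=0.5, explore=0.1):
        estimate = {a: 0.0 for a in actions}      # estimated value of each defense action
        counts = {a: 0 for a in actions}
        for _ in range(episodes):
            # Action: mostly exploit the current estimate, occasionally explore.
            a = random.choice(actions) if random.random() < explore \
                else max(actions, key=estimate.get)
            # Sensation: noisy feedback from the environment / adversary.
            feedback = true_reward(a) + random.gauss(0, noise)
            # Estimation: a running average shrinks the uncertainty over time.
            counts[a] += 1
            estimate[a] += (feedback - estimate[a]) / counts[a]
        return estimate

    # Hypothetical example: the defender learns which of three defensive
    # configurations yields the highest expected payoff.
    payoff = {"deception": 1.0, "mtd": 2.0, "honeypot": 1.5}
    print(strategic_learning_loop(list(payoff), lambda a: payoff[a]))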

    Game Theory in Distributed Systems Security: Foundations, Challenges, and Future Directions

    Full text link
    Many of our critical infrastructure systems and personal computing systems have a distributed computing structure. The incentives to attack them have been growing rapidly, as has their attack surface due to increasing levels of connectedness. Therefore, we feel it is time to bring rigorous reasoning to securing such systems. The distributed system security and game theory technical communities can come together to effectively address this challenge. In this article, we lay out the foundations from each that we can build upon to achieve our goals. Next, we describe a set of research challenges for the community, organized into three categories (analytical, systems, and integration challenges), each with "short term" (2-3 years) and "long term" (5-10 years) items. This article was conceived through a community discussion at the 2022 NSF SaTC PI meeting. Comment: 11 pages in IEEE Computer Society magazine format, including references and author bios. There is 1 figure.

    Multi-Agent Learning for Security and Sustainability

    Get PDF
    This thesis studies the application of multi-agent learning in complex domains where safety and sustainability are crucial. We target some of the main obstacles to the deployment of multi-agent learning techniques in such domains: modelling complex environments with multi-agent interaction, designing robust learning processes, and modelling adversarial agents. The main goal of using modern multi-agent learning methods is to improve the effectiveness of behaviour in such domains, and hence increase sustainability and security. This thesis investigates three complex real-world domains: space debris removal, critical domains with risky states, and spatial security domains such as illegal rhino poaching. We first tackle the challenge of modelling a complex multi-agent environment. The focus is on the space debris removal problem; space debris poses a major threat to the sustainability of Earth orbit. We develop a high-fidelity space debris simulator that allows us to simulate the future evolution of the space debris environment. Using the data from the simulator, we propose a surrogate model, which enables fast evaluation of different strategies chosen by the space actors. We then analyse the dynamics of strategic decision making among multiple space actors, comparing different models of agent interaction: static vs. dynamic and centralised vs. decentralised. The outcome of our work can help future decision makers design debris removal strategies and consequently mitigate the threat of space debris. Next, we study how to design a robust learning process in critical domains with risky states, where destabilisation of local components can lead to severe impact on the whole network. We propose a novel robust operator κ that can be combined with reinforcement learning methods, leading to safe policies that mitigate the threat of external attack or failure in the system. Finally, we investigate the challenge of learning effective behaviour while facing adversarial attackers in spatial security domains such as illegal rhino poaching. We assume that such attackers can be observed only occasionally. Our approach combines Bayesian inference with temporal difference learning in order to build a model of the attacker's behaviour. Our method can effectively use partial observations of the attacker's location and approximate the performance of the full-observability case. This thesis therefore presents novel methods and tackles several important obstacles to deploying multi-agent learning algorithms in the real world, further narrowing the reality gap between theoretical models and real-world applications.
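    The abstract does not spell out the form of the robust operator κ, so the Python sketch below is only a hedged illustration of the general idea: a value backup that interpolates between the expected and the worst-case successor value, discouraging policies that can drift into risky states. The interpolation scheme and parameter names are assumptions, not the thesis's definition.

    def robust_backup(q, transitions, gamma=0.95, kappa=0.5):
        """One robust backup for a (state, action) pair.

        q:           dict mapping state -> {action: value}
        transitions: list of (prob, reward, next_state) for the pair
        kappa:       0 gives the risk-neutral backup, 1 the worst-case backup
        """
        outcomes = [r + gamma * max(q[s2].values()) for _, r, s2 in transitions]
        expected = sum(p * o for (p, _, _), o in zip(transitions, outcomes))
        worst = min(outcomes)
        return (1 - kappa) * expected + kappa * worst

    # Hypothetical two-outcome transition: a safe successor and a risky one.
    q = {"safe": {"stay": 1.0}, "risky": {"stay": -10.0}}
    print(robust_backup(q, [(0.9, 0.0, "safe"), (0.1, 0.0, "risky")], kappa=0.5))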

    Contributions to the Security of Machine Learning

    Get PDF
    Machine learning (ML) applications have experienced unprecedented growth over the last two decades. However, the ever increasing adoption of ML methodologies has revealed important security issues. Among these, vulnerabilities to adversarial examples, data instances targeted at fooling ML algorithms, are especially important. Examples abound. For instance, it is relatively easy to fool a spam detector simply by misspelling spam words. Obfuscation of malware code can make it seem legitimate. Simply adding stickers to a stop sign could make an autonomous vehicle classify it as a merge sign. Consequences could be catastrophic. Indeed, ML is designed to work in stationary and benign environments. However, in certain scenarios, the presence of adversaries that actively manipulate input data to fool ML systems and attain benefits breaks such stationarity requirements: training and operation conditions are no longer identical. This creates a whole new class of security vulnerabilities that ML systems may face and a new desirable property: adversarial robustness. If we are to trust operations based on ML outputs, it becomes essential that learning systems are robust to such adversarial manipulations.
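    A minimal illustration of the adversarial-example phenomenon described above, sketched in Python with a toy linear classifier: a small, gradient-guided perturbation (in the spirit of the well-known fast gradient sign method, not any attack specific to this thesis) flips the predicted label. The model, labels, and numbers are purely illustrative.

    import numpy as np

    def fgsm_like_perturbation(w, x, y, eps=0.25):
        """Move x against its true label y (+1/-1) for the scorer sign(w . x)."""
        grad_wrt_x = -y * w                  # gradient of a hinge-style loss w.r.t. x
        return x + eps * np.sign(grad_wrt_x)

    w = np.array([1.0, -2.0])                # toy linear model
    x, y = np.array([0.6, 0.1]), +1          # correctly classified: w . x = 0.4 > 0
    x_adv = fgsm_like_perturbation(w, x, y)
    print(np.sign(w @ x), np.sign(w @ x_adv))   # prediction flips from +1 to -1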

    Adversarial Security Attacks and Perturbations on Machine Learning and Deep Learning Methods

    Full text link
    The ever-growing volume of big data and emerging artificial intelligence (AI) demand the use of machine learning (ML) and deep learning (DL) methods. Cybersecurity also benefits from ML and DL methods for various types of applications. These methods, however, are susceptible to security attacks: adversaries can exploit the training and testing data of the learning models, or can probe the workings of those models to launch advanced future attacks. The topic of adversarial security attacks and perturbations within the ML and DL domains is a recent area of exploration that has attracted great interest from security researchers and practitioners. The literature covers different adversarial security attacks and perturbations on ML and DL methods, each with its own presentation style and merits; what the research community currently needs is a review that consolidates knowledge of this increasingly active and growing topic. In this review paper, we specifically target new researchers in the cybersecurity domain who may seek to acquire basic knowledge of machine learning and deep learning models and algorithms, as well as of the relevant adversarial security attacks and perturbations.

    Game Theoretic Approaches in Vehicular Networks: A Survey

    Full text link
    In the era of the Internet of Things (IoT), vehicles and other intelligent components in the Intelligent Transportation System (ITS) are connected, forming Vehicular Networks (VNs) that provide efficient and secure traffic, ubiquitous access to information, and various applications. However, as the number of connected nodes keeps increasing, it is challenging to satisfy the large volume of varied service requests with different Quality of Service (QoS) and security requirements in highly dynamic VNs. Intelligent nodes in VNs can compete or cooperate for limited network resources so that either individual or group objectives can be achieved. Game theory (GT), a theoretical framework designed for strategic interactions among rational decision-makers faced with scarce resources, can be used to model and analyze individual or group behaviors of communication entities in VNs. This paper primarily surveys recent advances in applying GT to solve various challenges in VNs. As VNs and GT have been extensively investigated, the survey starts with a brief introduction to the basic concepts and classifications of GT used in VNs. Then, a comprehensive review of applications of GT in VNs is presented, primarily covering the aspects of QoS and security. Moreover, with the development of fifth-generation (5G) wireless communication, recent contributions of GT to the diverse emerging 5G technologies being integrated into VNs are surveyed. Finally, several key research challenges and possible solutions for applying GT in VNs are outlined.

    Artificial Intelligence for Social Good: A Survey

    Full text link
    Artificial intelligence for social good (AI4SG) is a research theme that aims to use and advance artificial intelligence to address societal issues and improve the well-being of the world. AI4SG has received a great deal of attention from the research community in the past decade, with several successful applications. Building on the most comprehensive collection of the AI4SG literature to date, with over 1000 contributed papers, we provide a detailed account and analysis of the work under this theme in the following ways. (1) We quantitatively analyze the distribution and trend of the AI4SG literature in terms of application domains and AI techniques used. (2) We propose three conceptual methods to systematically group the existing literature and analyze the eight AI4SG application domains in a unified framework. (3) We distill five research topics that represent the common challenges in AI4SG across various application domains. (4) We discuss five issues that, we hope, can shed light on the future development of AI4SG research.

    Making friends on the fly: advances in ad hoc teamwork

    Get PDF
    Given the continuing improvements in design and manufacturing processes, in addition to improvements in artificial intelligence, robots are being deployed in an increasing variety of environments for longer periods of time. As the number of robots grows, it is expected that they will encounter and interact with other robots. Additionally, the number of companies and research laboratories producing these robots is increasing, leading to the situation where these robots may not share a common communication or coordination protocol. While standards for coordination and communication may be created, we expect that any standards will lag behind state-of-the-art protocols, and robots will additionally need to reason intelligently about their teammates with limited information. This problem motivates the area of ad hoc teamwork, in which an agent may potentially cooperate with a variety of teammates in order to achieve a shared goal. We argue that agents that effectively reason about ad hoc teamwork need to exhibit three capabilities: 1) robustness to teammate variety, 2) robustness to diverse tasks, and 3) fast adaptation. This thesis focuses on addressing all three of these challenges. In particular, it introduces algorithms for quickly adapting to unknown teammates that enable agents to react to new teammates without extensive observations. The majority of existing multiagent algorithms focus on scenarios where all agents share coordination and communication protocols, and while previous research on ad hoc teamwork considers some of these three challenges, this thesis introduces a new algorithm, PLASTIC, which is the first to address all three in a single algorithm. PLASTIC adapts quickly to unknown teammates by reusing knowledge it has learned about previous teammates and exploiting any available expert knowledge. Given this knowledge, PLASTIC selects online which previous teammates are most similar to the current ones and uses this information to adapt to their behaviors. This thesis introduces two instantiations of PLASTIC. The first is a model-based approach, PLASTIC-Model, that builds models of previous teammates' behaviors and plans online to determine the best course of action. The second is a policy-based approach, PLASTIC-Policy, which learns policies for cooperating with past teammates and selects from among these policies online. Furthermore, we introduce a new transfer learning algorithm, TwoStageTransfer, that allows transferring knowledge from many past teammates while considering how similar each teammate is to the current ones. We theoretically analyze the computational tractability of PLASTIC-Model in a number of scenarios with unknown teammates. Additionally, we empirically evaluate PLASTIC in three domains that cover a spread of possible settings. Our evaluations show that PLASTIC can learn to communicate with unknown teammates using a limited set of messages, coordinate with externally-created teammates that do not reason about ad hoc teams, and act intelligently in domains with continuous states and actions. Furthermore, these evaluations show that TwoStageTransfer outperforms existing transfer learning algorithms and enables PLASTIC to adapt even better to new teammates. We also identify three dimensions that we argue best describe ad hoc teamwork scenarios, and we hypothesize that these dimensions are useful for analyzing similarities among domains, determining which can be tackled by similar algorithms, and identifying avenues for future research. The work presented in this thesis represents an important step towards enabling agents to adapt to unknown teammates in the real world. PLASTIC significantly broadens the robustness of robots to their teammates and allows them to quickly adapt to new teammates by reusing previously learned knowledge.
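    A hedged Python sketch of the teammate-selection idea behind PLASTIC: maintain a belief over previously learned teammate models, shrink the weight of models that predicted the observed teammate action poorly, and act with the policy learned for the most likely model. The loss-based update only mirrors the flavour of the approach at a high level; the concrete models, update rule, and names below are illustrative assumptions, not the thesis's exact formulation.

    def update_belief(belief, models, state, observed_action, eta=0.2):
        """models[name][state][action] = that model's predicted action probability."""
        new = {}
        for name, model in models.items():
            p = model.get(state, {}).get(observed_action, 0.0)
            # Lose weight in proportion to how poorly the model predicted the action.
            new[name] = belief[name] * max(1.0 - eta * (1.0 - p), 1e-6)
        z = sum(new.values())
        return {name: v / z for name, v in new.items()}

    # Two previously learned teammate models (hypothetical) and a uniform prior.
    models = {
        "aggressive": {"s0": {"push": 0.9, "wait": 0.1}},
        "cautious":   {"s0": {"push": 0.2, "wait": 0.8}},
    }
    belief = {"aggressive": 0.5, "cautious": 0.5}
    belief = update_belief(belief, models, "s0", "wait")
    # The belief now favours "cautious"; the agent would then act with the
    # policy it previously learned for cooperating with that teammate.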