4 research outputs found

    Optimal Timing in Dynamic and Robust Attacker Engagement During Advanced Persistent Threats

    Advanced persistent threats (APTs) are stealthy attacks that use social engineering and deception to give adversaries insider access to networked systems. Against APTs, active defense technologies aim to create and exploit an information asymmetry in favor of defenders. In this paper, we study a scenario in which a powerful defender uses honeynets for active defense in order to observe an attacker who has penetrated the network. Rather than immediately ejecting the attacker, the defender may elect to gather information. We introduce an undiscounted, infinite-horizon Markov decision process on a continuous state space to model the defender's problem. We find a threshold on the amount of information that the defender should gather about the attacker before ejecting him. We then study the robustness of this policy using a Stackelberg game. Finally, we simulate the policy for a conceptual network. Our results provide a quantitative foundation for studying optimal timing of attacker engagement in network defense.
    Comment: Submitted to the 2019 Intl. Symp. Modeling and Optimization in Mobile, Ad Hoc, and Wireless Nets. (WiOpt)
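    The threshold structure of the optimal policy is easy to prototype. Below is a minimal Monte Carlo sketch of the idea: the defender accrues information each step the attacker stays engaged, risks a per-step damage event, and ejects once a threshold is reached. All parameters, names, and the reward model are illustrative assumptions, not the paper's actual MDP.

    import random

    INFO_PER_STEP = 1.0   # information gained per engagement step (assumed)
    DAMAGE_PROB = 0.05    # per-step chance the attacker does damage (assumed)
    DAMAGE_COST = 30.0    # cost incurred if damage occurs (assumed)

    def episode_reward(threshold: float) -> float:
        """Gather information until the threshold is reached, then eject."""
        info, reward = 0.0, 0.0
        while info < threshold:
            info += INFO_PER_STEP
            reward += INFO_PER_STEP            # value of observed behavior
            if random.random() < DAMAGE_PROB:  # engagement backfires
                return reward - DAMAGE_COST
        return reward                          # ejected safely at the threshold

    def estimate_value(threshold: float, runs: int = 10_000) -> float:
        return sum(episode_reward(threshold) for _ in range(runs)) / runs

    for t in (1, 2, 5, 10, 20, 40):
        print(f"threshold={t:>3}: estimated value {estimate_value(t):7.2f}")

    Sweeping the threshold this way exposes the trade-off the paper formalizes: too low and the defender forgoes information, too high and the expected damage dominates.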

    Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes

    A honeynet is a promising active cyber defense mechanism. It reveals fundamental Indicators of Compromise (IoCs) by luring attackers into conducting adversarial behaviors in a controlled and monitored environment. Active interaction at the honeynet brings a high reward but also introduces high implementation costs and risks of adversarial honeynet exploitation. In this work, we apply an infinite-horizon Semi-Markov Decision Process (SMDP) to characterize the stochastic transitions and sojourn times of attackers in the honeynet and to quantify the reward-risk trade-off. In particular, we design adaptive long-term engagement policies shown to be risk-averse, cost-effective, and time-efficient. Numerical results demonstrate that our adaptive engagement policies can quickly attract attackers to the target honeypot and engage them long enough to obtain worthy threat information, while keeping the penetration probability at a low level. The results show that the expected utility is robust against attackers with a large range of persistence and intelligence. Finally, we apply reinforcement learning to the SMDP to overcome the curse of modeling. Under a prudent choice of the learning rate and exploration policy, we achieve quick and robust convergence of the optimal policy and value.
    Comment: The presentation can be found at https://youtu.be/GPKT3uJtXqk. arXiv admin note: text overlap with arXiv:1907.0139
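    An SMDP differs from an ordinary MDP in that rewards and discounting depend on random sojourn times. The toy Q-learning loop below illustrates that mechanic, discounting the continuation value by exp(-beta * tau); the environment, states, and parameters are invented for illustration and do not reproduce the paper's honeynet model.

    import math
    import random
    from collections import defaultdict

    BETA = 0.1      # continuous-time discount rate (assumed)
    ALPHA = 0.1     # learning rate (assumed)
    EPSILON = 0.1   # exploration probability (assumed)
    ACTIONS = ("low_interaction", "high_interaction")

    def toy_env_step(state, action):
        """Hypothetical honeynet node: returns (reward, sojourn_time, next_state)."""
        tau = random.expovariate(1.0)  # random sojourn time in the current node
        rate = 2.0 if action == "high_interaction" else 0.5
        reward = rate * tau            # threat info accumulated during the sojourn
        return reward, tau, (state + random.randint(0, 1)) % 5

    Q = defaultdict(float)

    def greedy(state):
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    state = 0
    for _ in range(50_000):
        a = random.choice(ACTIONS) if random.random() < EPSILON else greedy(state)
        reward, tau, nxt = toy_env_step(state, a)
        target = reward + math.exp(-BETA * tau) * Q[(nxt, greedy(nxt))]
        Q[(state, a)] += ALPHA * (target - Q[(state, a)])
        state = nxt

    print({s: greedy(s) for s in range(5)})

    The exp(-beta * tau) factor is what makes the update semi-Markov: long sojourns discount the continuation value more heavily, so the learner trades engagement time against risk.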

    Farsighted Risk Mitigation of Lateral Movement Using Dynamic Cognitive Honeypots

    Lateral movement of advanced persistent threats poses a severe security challenge. Due to the stealthy and persistent nature of lateral movement, defenders need to consider time and spatial locations holistically to discover latent attack paths across a large time scale and achieve long-term security for the target assets. In this work, we propose a time-expanded random network to model the stochastic service links in a user-host enterprise network and the adversarial lateral movement. We deploy cognitive honeypots at idle production nodes and disguise honey links as service links to detect and deter adversarial lateral movement. The location of the honeypot changes randomly over time, which increases the honeypots' stealthiness. Since the defender does not know whether, when, or where the initial intrusion and the lateral movement occur, the honeypot policy aims to reduce the target assets' Long-Term Vulnerability (LTV) for proactive and persistent protection. We further characterize three tradeoffs, i.e., the probability of interference, the stealthiness level, and the roaming cost. To counter the curse of multiple attack paths, we propose an iterative algorithm and approximate the LTV with the union bound for computationally efficient deployment of cognitive honeypots. The results of the vulnerability analysis illustrate the bounds, trends, and residue of the LTV when the adversarial lateral movement has infinite duration. Besides honeypot policies, we obtain a critical threshold of compromisability to guide the design and modification of the current system parameters for a higher level of long-term security. We show that the target node can achieve zero vulnerability under infinite stages of lateral movement if the probability of movement deterrence is not less than the threshold.
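    The union-bound approximation mentioned above can be illustrated in a few lines: bound the probability that any attack path succeeds by the sum of per-path success probabilities, each shrunk by the honeypots' deterrence. The paths, link probabilities, and deterrence value below are invented placeholders; the paper's time-expanded network model is far richer.

    from math import prod

    # Each path: per-link compromise probabilities along a latent attack path (assumed).
    attack_paths = [
        [0.6, 0.5, 0.4],
        [0.7, 0.3],
        [0.5, 0.5, 0.5, 0.5],
    ]

    def path_success(path, deterrence):
        # A link is traversed only if compromised and not deterred by a honey link.
        return prod(p * (1.0 - deterrence) for p in path)

    def ltv_union_bound(paths, deterrence):
        # P(union of path events) <= sum of individual probabilities, capped at 1.
        return min(1.0, sum(path_success(p, deterrence) for p in paths))

    print(f"LTV bound, no honeypots:   {ltv_union_bound(attack_paths, 0.0):.4f}")
    print(f"LTV bound, deterrence 0.4: {ltv_union_bound(attack_paths, 0.4):.4f}")

    Because the bound is a sum over paths, it can be updated path by path, which is what keeps an iterative deployment algorithm computationally cheap even when the number of attack paths is large.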

    Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense

    The increasing instances of advanced attacks call for a new defense paradigm that is active, autonomous, and adaptive, named the '3A' defense paradigm. This chapter introduces three defense schemes that actively interact with attackers to increase the attack cost and gather threat information, i.e., defensive deception for detection and counter-deception, feedback-driven Moving Target Defense (MTD), and adaptive honeypot engagement. Due to cyber deception, external noise, and the absence of knowledge about the other players' behaviors and goals, these schemes face three progressively stricter information restrictions, i.e., parameter uncertainty, payoff uncertainty, and environmental uncertainty. To estimate the unknowns and reduce uncertainty, we adopt three different strategic learning schemes that fit the associated information restrictions. All three learning schemes share the same feedback structure of sensation, estimation, and action, so that the most rewarding policies are reinforced and converge to the optimal ones in an autonomous and adaptive fashion. This work aims to shed light on proactive defense strategies, lay a solid foundation for strategic learning under incomplete information, and quantify the tradeoff between security and costs.
    Comment: arXiv admin note: text overlap with arXiv:1906.1218
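    The shared sensation-estimation-action feedback loop can be made concrete with a bandit-style sketch: the defender senses a noisy payoff, updates a running estimate, and plays the action with the best estimate while still exploring. The action names and payoff model below are hypothetical placeholders, not the chapter's actual schemes.

    import random

    ACTIONS = ("deceive", "move_target", "engage_honeypot")
    TRUE_PAYOFF = {"deceive": 0.4, "move_target": 0.6, "engage_honeypot": 0.8}  # hidden from the learner

    estimates = {a: 0.0 for a in ACTIONS}
    counts = {a: 0 for a in ACTIONS}

    for t in range(1, 5001):
        # Action: explore with decaying probability, otherwise exploit the estimate.
        if random.random() < t ** -0.5:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=estimates.get)
        # Sensation: noisy payoff feedback from the environment.
        payoff = TRUE_PAYOFF[action] + random.gauss(0.0, 0.2)
        # Estimation: running average reinforces the most rewarding policies.
        counts[action] += 1
        estimates[action] += (payoff - estimates[action]) / counts[action]

    print({a: round(v, 3) for a, v in estimates.items()})

    Each of the three schemes instantiates this loop with an estimation step suited to its information restriction, from estimating unknown parameters up to learning the environment itself.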