
    Dynamic adversarial mining - effectively applying machine learning in adversarial non-stationary environments.

    While the understanding of machine learning and data mining is still in its early stages, engineering applications of these techniques have found immense acceptance and success. Cybersecurity applications such as intrusion detection systems, spam filtering, and CAPTCHA authentication have all begun adopting machine learning as a viable technique for dealing with large-scale adversarial activity. However, naive use of machine learning in an adversarial setting is prone to reverse engineering and evasion attacks, as most of these techniques were designed primarily for a static setting. The security domain is a dynamic landscape, with a never-ending arms race between system designers and attackers. Any solution designed for such a domain needs to account for an active adversary and to evolve over time in the face of emerging threats. We term this the ‘Dynamic Adversarial Mining’ problem, and the presented work provides the foundation for this new interdisciplinary area of research at the crossroads of machine learning, cybersecurity, and streaming data mining. We start with a white-hat analysis of the vulnerability of classification systems to exploratory attacks. The proposed ‘Seed-Explore-Exploit’ framework characterizes and models attacks ranging from simple random evasion to sophisticated reverse engineering. We observe that even systems with prediction accuracy close to 100% can be evaded with more than 90% precision, without any information about the underlying classifier, the training dataset, or the application domain. Attacks on machine learning systems cause the data to exhibit non-stationarity (i.e., the training and testing data have different distributions). It is necessary to detect these changes in distribution, called concept drift, as they can cause the prediction performance of the model to degrade over time. However, detection cannot rely heavily on labeled data to compute performance explicitly and monitor a drop, as labeling is expensive, time consuming, and at times not possible at all. We therefore propose the ‘Margin Density Drift Detection (MD3)’ algorithm, which can reliably detect concept drift from unlabeled data alone. MD3 provides high detection accuracy with a low false alarm rate, making it suitable for cybersecurity applications, where excessive false alarms are expensive and can lead to loss of trust in the warning system. Additionally, MD3 is designed as a classifier-independent, streaming algorithm for use in a variety of continuous, never-ending learning systems. We then propose a ‘Dynamic Adversarial Mining’ based learning framework for non-stationary and adversarial environments, which provides ‘security by design’. The proposed ‘Predict-Detect’ classifier framework aims to provide robustness against attacks, ease of attack detection using unlabeled data, and swift recovery from attacks. Feature hiding and obfuscation of feature importance are proposed as strategies to enhance the framework's security. Metrics for evaluating the dynamic security of a system and its recoverability after an attack are introduced to provide a practical way of measuring the efficacy of dynamic security strategies. The framework is developed as a streaming data methodology, capable of functioning continually with limited supervision and responding effectively to adversarial dynamics. The developed ideas, methodology, algorithms, and experimental analysis aim to provide a foundation for future work in the area of ‘Dynamic Adversarial Mining’, in which a holistic approach to machine learning based security is motivated.
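
    The following is a minimal, hypothetical sketch of the margin-density idea behind an MD3-style detector, not the exact algorithm from this work: the fraction of unlabeled streaming samples falling inside a classifier's margin is tracked, and drift is signaled when that fraction deviates markedly from its reference value. The classifier, window size, sensitivity, and synthetic data below are illustrative assumptions.

    import numpy as np
    from collections import deque
    from sklearn.svm import LinearSVC

    class MarginDensityDriftDetector:
        """Signal drift when the margin density (fraction of samples with
        |decision_function| <= 1) over a sliding window of unlabeled samples
        deviates from its reference value by more than `sensitivity` standard
        errors. No labels are needed at detection time."""

        def __init__(self, classifier, reference_X, window_size=500, sensitivity=3.0):
            self.clf = classifier
            self.window_size = window_size
            self.sensitivity = sensitivity
            in_margin = np.abs(classifier.decision_function(reference_X)) <= 1.0
            self.md_ref = in_margin.mean()                 # reference margin density
            self.se_ref = np.sqrt(self.md_ref * (1 - self.md_ref) / window_size)
            self._window = deque(maxlen=window_size)

        def update(self, x):
            """Consume one unlabeled sample; return True once drift is signaled."""
            self._window.append(abs(self.clf.decision_function([x])[0]) <= 1.0)
            if len(self._window) < self.window_size:
                return False
            md_now = sum(self._window) / self.window_size
            return abs(md_now - self.md_ref) > self.sensitivity * self.se_ref

    # Usage with synthetic data: the drifted stream concentrates near the decision
    # boundary, as an evasion campaign probing the classifier's blind spots might.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(1000, 5))
    y_train = (X_train[:, 0] > 0).astype(int)
    clf = LinearSVC(C=1.0).fit(X_train, y_train)

    detector = MarginDensityDriftDetector(clf, X_train)
    X_stream = rng.normal(scale=0.3, size=(1000, 5))
    print("drift signaled:", any(detector.update(x) for x in X_stream))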

    Modeling, Quantifying, and Limiting Adversary Knowledge

    Users participating in online services are required to relinquish control over potentially sensitive personal information, exposing them to intentional or unintentional misuse of that information by service providers. Users wishing to avoid this must either abstain from often extremely useful services or provide false information, which is usually contrary to the terms of service they must abide by. An attractive middle-ground alternative is to keep control in the hands of the users and provide a mechanism through which the information necessary for useful services can be queried. Users then need not trust any external party with the management of their information, but they face the problem of judging when queries by service providers should be answered and when they should be refused because they reveal too much sensitive information. Judging query safety is difficult. Two queries may be benign in isolation but might reveal more than a user is comfortable with in combination. Additionally, malicious adversaries who wish to learn more than allowed might query in a manner that attempts to hide the flows of sensitive information. Finally, users cannot rely on human inspection of queries, due to their volume and the general lack of expertise. This thesis tackles the automation of query judgment, giving the self-reliant user a means with which to discern benign queries from dangerous or exploitative ones. The approach is based on explicit modeling and tracking of the knowledge of adversaries as they learn about a user through the queries they are allowed to observe. The approach quantifies the absolute risk a user is exposed to, taking into account all the information that has already been revealed when deciding whether to answer a query. Proposed techniques for approximate but sound probabilistic inference are used to address the tractability of the approach, letting the user trade off utility (in terms of the queries judged safe) against efficiency (in terms of the expense of knowledge tracking) while maintaining the guarantee that risk to the user is never underestimated. We apply the approach to settings where user data changes over time and settings where multiple users wish to pool their data to perform useful collaborative computations without revealing too much information. By addressing one of the major obstacles preventing the viability of personal information control, this work brings this attractive proposition closer to reality.
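
    A minimal, hypothetical sketch of the knowledge-tracking idea, not the thesis's inference machinery: the adversary's belief is kept as a distribution over the user's secret, conditioned on every answered query, and a query is refused if some possible answer could push the adversary's guessing probability above a risk threshold, so the risk is never underestimated. The secret, queries, and threshold below are invented for illustration.

    from fractions import Fraction

    def posterior(belief, query, answer):
        """Condition the belief (dict: secret -> prob) on query(secret) == answer."""
        consistent = {s: p for s, p in belief.items() if query(s) == answer}
        total = sum(consistent.values())
        return {s: p / total for s, p in consistent.items()}

    def worst_case_risk(belief, query):
        """Max guessing probability over every answer the query could return."""
        answers = {query(s) for s in belief}
        return max(max(posterior(belief, query, a).values()) for a in answers)

    def answer_if_safe(belief, query, secret, threshold=Fraction(1, 4)):
        """Answer only when no possible answer reveals too much; else refuse."""
        if worst_case_risk(belief, query) > threshold:
            return belief, None                      # refuse: risk too high
        ans = query(secret)
        return posterior(belief, query, ans), ans    # reveal answer, update belief

    # The secret is a birth year; the adversary starts with a uniform prior.
    secret = 1987
    belief = {year: Fraction(1, 60) for year in range(1950, 2010)}

    belief, a1 = answer_if_safe(belief, lambda y: y >= 1980, secret)   # coarse: allowed
    belief, a2 = answer_if_safe(belief, lambda y: y // 10, secret)     # decade: allowed
    belief, a3 = answer_if_safe(belief, lambda y: y, secret)           # exact year: refused
    print(a1, a2, a3, "max guess prob:", float(max(belief.values())))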

    Leveraging Sociological Models for Predictive Analytics

    There is considerable interest in developing techniques for predicting human behavior, for instance to enable emerging contentious situations to be forecast or the nature of ongoing but “hidden” activities to be inferred. A promising approach to this problem is to identify and collect appropriate empirical data and then apply machine learning methods to these data to generate the predictions. This paper shows that the performance of such learning algorithms can often be improved substantially by leveraging sociological models in their development and implementation. In particular, we demonstrate that sociologically grounded learning algorithms outperform gold-standard methods in three important and challenging tasks: 1) inferring the (unobserved) nature of relationships in adversarial social networks, 2) predicting whether nascent social diffusion events will “go viral”, and 3) anticipating and defending against future actions of opponents in adversarial settings. Significantly, the new algorithms perform well even when there is limited data available for their training and execution. Keywords—predictive analysis, sociological models, social networks, empirical analysis, machine learning.
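
    As a hypothetical illustration of what a sociologically grounded predictor can look like (not the paper's algorithm), the sketch below uses structural balance theory, roughly "the friend of my friend is my friend, the enemy of my enemy is my friend", to infer an unobserved tie from the signs along two-hop paths. The toy network is invented for the example.

    from collections import defaultdict

    def predict_sign(edges, u, w):
        """Predict the sign of the (u, w) tie by majority vote over the balanced
        triads formed with the common neighbors of u and w."""
        sign = defaultdict(dict)
        for (a, b), s in edges.items():
            sign[a][b] = s
            sign[b][a] = s
        votes = [sign[u][v] * sign[v][w]
                 for v in set(sign[u]) & set(sign[w]) if v not in (u, w)]
        return (1 if sum(votes) >= 0 else -1), len(votes)

    # +1 = ally, -1 = adversary; the (alice, dana) tie is unobserved.
    observed = {("alice", "bob"): +1, ("bob", "dana"): -1,
                ("alice", "carol"): -1, ("carol", "dana"): +1}
    print(predict_sign(observed, "alice", "dana"))   # -> (-1, 2): predicted adversaries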

    Navigating the Landscape of Robust and Secure Artificial Intelligence: A Comprehensive Literature Review

    Addressing the multidimensional nature of Artificial Intelligence assurance, this survey elaborates on various aspects of ensuring the reliability and safety of computerized systems. It navigates model errors, unmodelled phenomena, and security threats to provide a thorough literature review. The review covers past mitigation strategies for model errors, the challenge of under-specification in modern ML models, and why understanding uncertainty is crucial. In addition, it examines the security foundations of AI systems, the emerging field of Adversarial Machine Learning, and the processes necessary for testing and evaluating systems against adversarial case studies. The review also considers the DoD context and how the landscape of developmental and operational testing is changing, along with the cultural shifts that must occur to achieve robust and secure AI implementations.

    Cyber Flag: A Realistic Cyberspace Training Construct

    As is well understood, the rapidly unfolding challenges of cyberspace represent a fundamental warfare paradigm shift, revolutionizing the way future wars will be fought and won. A significant test for the Air Force (indeed, for any organization with a credible presence in cyberspace) will be providing a realistic training environment that fully meets this challenge. Why create another Flag-level exercise? Realistic training (training that is effective, comprehensive, and coordinated) is crucial to success in time of war. Red Flag provides dominant training within the air domain, and now, with the evolution of cyberspace, a comprehensive training environment is necessary to meet this growing and broadening threat. This thesis builds on the Red Flag tactical training exercise to define a future environment that combines the air, space, and cyberspace domains, with specific emphasis on cyberspace capabilities and threats. Red Flag has been, and continues to be, a great tactical training exercise; Cyber Flag would use the best practices of Red Flag (and other realistic training venues) to define a future training environment for the cyberspace domain. There is no better training than the hands-on realism associated with participation in an exercise such as Red Flag. Secretary Michael W. Wynne has a vision for dominant operations in cyberspace comparable to the Air Force's global, strategic omnipresence in air and space. This bold vision requires a combination of joint coordination, skilled forces, and a realistic training environment to bring them all together; Cyber Flag is the suggested vehicle for accomplishing this.

    Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense

    The increasing instances of advanced attacks call for a new defense paradigm that is active, autonomous, and adaptive, termed the ‘3A’ defense paradigm. This chapter introduces three defense schemes that actively interact with attackers to increase the attack cost and gather threat information: defensive deception for detection and counter-deception, feedback-driven Moving Target Defense (MTD), and adaptive honeypot engagement. Due to cyber deception, external noise, and the lack of knowledge of the other players' behaviors and goals, these schemes face three progressive levels of information restriction: parameter uncertainty, payoff uncertainty, and environmental uncertainty. To estimate the unknown and reduce uncertainty, we adopt three different strategic learning schemes that fit the associated information restrictions. All three learning schemes share the same feedback structure of sensation, estimation, and action, so that the most rewarding policies are reinforced and converge to the optimal ones in an autonomous and adaptive fashion. This work aims to shed light on proactive defense strategies, lay a solid foundation for strategic learning under incomplete information, and quantify the tradeoff between security and cost.
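
    A minimal sketch of the sensation-estimation-action feedback loop described above, cast as tabular Q-learning on a toy honeypot-engagement problem; the states, actions, transition probabilities, and rewards are invented for illustration and are not the chapter's model.

    import random

    STATES  = ["probing", "engaged", "left"]           # observed attacker behavior
    ACTIONS = ["low_interaction", "high_interaction"]  # honeypot engagement level

    def step(state, action, rng):
        """Toy environment (state kept for generality but unused here): richer
        engagement yields more threat intelligence per step, but raises the
        chance the attacker detects the trap and leaves."""
        if action == "high_interaction":
            return ("left", 2.0) if rng.random() < 0.3 else ("engaged", 2.0)
        return ("engaged", 0.5) if rng.random() < 0.7 else ("left", 0.5)

    def q_learn(episodes=5000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
        """Tabular Q-learning: sense the state, act, observe the reward, and
        reinforce the action values so the most rewarding policy emerges."""
        rng = random.Random(seed)
        Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
        for _ in range(episodes):
            s = "probing"
            while s != "left":
                a = (rng.choice(ACTIONS) if rng.random() < eps
                     else max(ACTIONS, key=lambda a_: Q[(s, a_)]))
                s_next, r = step(s, a, rng)
                best_next = max(Q[(s_next, a_)] for a_ in ACTIONS)
                Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
                s = s_next
        return Q

    Q = q_learn()
    for s in ("probing", "engaged"):
        best = max(ACTIONS, key=lambda a: Q[(s, a)])
        print(f"in state {s!r} the learned policy chooses {best!r}")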

    Security-by-experiment: lessons from responsible deployment in cyberspace

    Conceiving of new technologies as social experiments is a means to discuss the responsible deployment of technologies that may have unknown and potentially harmful side effects. Thus far, the uncertain outcomes addressed in the paradigm of new technologies as social experiments have been mostly safety-related, meaning that potential harm is caused by the design plus accidental events in the environment. In some domains, such as cyberspace, adversarial agents (attackers) may be at least as important when it comes to undesirable effects of deployed technologies. In such cases, conditions for responsible experimentation may need to be implemented differently, as attackers behave strategically rather than probabilistically. In this contribution, we outline how adversarial aspects are already taken into account in technology deployment in the field of cyber security, and what the paradigm of new technologies as social experiments can learn from this. In particular, we show the importance of adversarial roles in social experiments with new technologies.