Dynamic adversarial mining - effectively applying machine learning in adversarial non-stationary environments.
While understanding of machine learning and data mining is still in its budding stages, engineering applications of these techniques have found immense acceptance and success. Cybersecurity applications such as intrusion detection systems, spam filtering, and CAPTCHA authentication have all begun adopting machine learning as a viable technique to deal with large-scale adversarial activity. However, naive usage of machine learning in an adversarial setting is prone to reverse engineering and evasion attacks, as most of these techniques were designed primarily for a static setting. The security domain is a dynamic landscape, with an ongoing arms race between the system designer and the attackers. Any solution designed for such a domain needs to take into account an active adversary and needs to evolve over time in the face of emerging threats. We term this the "Dynamic Adversarial Mining" problem, and the presented work provides the foundation for this new interdisciplinary area of research, at the crossroads of machine learning, cybersecurity, and streaming data mining. We start with a white-hat analysis of the vulnerabilities of classification systems to exploratory attacks. The proposed "Seed-Explore-Exploit" framework provides characterization and modeling of attacks, ranging from simple random evasion attacks to sophisticated reverse engineering. It is observed that even systems with prediction accuracy close to 100% can be evaded with more than 90% precision, and that this evasion can be performed without any information about the underlying classifier, training dataset, or domain of application. Attacks on machine learning systems cause the data to exhibit non-stationarity (i.e., the training and testing data have different distributions). It is necessary to detect these changes in distribution, called concept drift, as they can cause the prediction performance of the model to degrade over time.
However, the detection cannot rely heavily on labeled data to compute performance explicitly and monitor a drop, as labeling is expensive, time consuming, and at times not possible at all. As such, we propose the "Margin Density Drift Detection (MD3)" algorithm, which can reliably detect concept drift from unlabeled data only. MD3 provides high detection accuracy with a low false alarm rate, making it suitable for cybersecurity applications, where excessive false alarms are expensive and can lead to loss of trust in the warning system. Additionally, MD3 is designed as a classifier-independent, streaming algorithm for use in a variety of continuous, never-ending learning systems. We then propose a "Dynamic Adversarial Mining" based learning framework for learning in non-stationary and adversarial environments, which provides "security by design". The proposed "Predict-Detect" classifier framework aims to provide robustness against attacks, ease of attack detection using unlabeled data, and swift recovery from attacks. Ideas of feature hiding and obfuscation of feature importance are proposed as strategies to enhance the learning framework's security. Metrics for evaluating the dynamic security of a system and its recoverability after an attack are introduced to provide a practical way of measuring the efficacy of dynamic security strategies. The framework is developed as a streaming data methodology, capable of continually functioning with limited supervision and effectively responding to adversarial dynamics. The developed ideas, methodology, algorithms, and experimental analysis aim to provide a foundation for future work in the area of "Dynamic Adversarial Mining", wherein a holistic approach to machine learning based security is motivated.
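The margin-density idea behind MD3 can be illustrated with a minimal sketch. This is not the published algorithm: the class name, the fixed-chunk processing, and the three-standard-deviation sensitivity threshold are assumptions for illustration, and `margin_fn` stands in for the classifier's notion of a sample falling inside its margin (e.g., an SVM's region where |f(x)| < 1).

```python
class MarginDensityDriftDetector:
    """Illustrative sketch: flag drift when the fraction of unlabeled
    samples falling in the classifier's margin deviates too far from
    its reference value. No labels are needed at detection time."""

    def __init__(self, margin_fn, sensitivity=3.0):
        self.margin_fn = margin_fn      # True if a sample lies in the margin
        self.sensitivity = sensitivity  # drift threshold, in std deviations
        self.reference_md = None
        self.reference_sd = None

    def fit_reference(self, chunks):
        """Estimate the expected margin density and its spread
        from chunks drawn while the distribution was stable."""
        densities = [sum(map(self.margin_fn, c)) / len(c) for c in chunks]
        mean = sum(densities) / len(densities)
        var = sum((d - mean) ** 2 for d in densities) / len(densities)
        self.reference_md = mean
        self.reference_sd = var ** 0.5

    def check(self, chunk):
        """Signal drift if this unlabeled chunk's margin density
        deviates beyond sensitivity * reference std deviation."""
        md = sum(map(self.margin_fn, chunk)) / len(chunk)
        deviation = abs(md - self.reference_md)
        return deviation > self.sensitivity * max(self.reference_sd, 1e-9)
```

A rising fraction of samples landing inside the margin indicates the incoming data no longer resembles the training distribution, which is the unlabeled drift signal the abstract describes.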
Modeling, Quantifying, and Limiting Adversary Knowledge
Users participating in online services are required to relinquish
control over potentially sensitive personal information, exposing
them to intentional or unintentional misuse of that information by
the service providers.
Users wishing to avoid this must either abstain from often extremely
useful services, or provide false information, which is usually
contrary to the terms of service they must abide by.
An attractive middle-ground alternative is to maintain control in
the hands of the users and provide a mechanism with which
information that is necessary for useful services can be queried.
Users need not trust any external party in the management of their
information but are now faced with the problem of judging when
queries by service providers should be answered or when they should
be refused due to revealing too much sensitive information.
Judging query safety is difficult.
Two queries may be benign in isolation but might reveal more than a
user is comfortable with in combination.
Additionally, malicious adversaries who wish to learn more than
allowed might query in a manner that attempts to hide the flows of
sensitive information.
Finally, users cannot rely on human inspection of queries, due to
their volume and the general lack of expertise.
This thesis tackles the automation of query judgment, giving the
self-reliant user a means with which to discern benign queries from
dangerous or exploitative ones.
The approach is based on explicit modeling and tracking of the
knowledge of adversaries as they learn about a user through the
queries they are allowed to observe.
The approach quantifies the absolute risk a user is exposed to,
taking into account all the information that has been revealed
already when deciding whether to answer a query.
Proposed techniques for approximate but sound probabilistic
inference are used to make the approach tractable, letting the user
trade off utility (in terms of the queries judged safe) against
efficiency (in terms of the expense of knowledge tracking), while
maintaining the guarantee that the risk to the user is never
underestimated.
We apply the approach to settings where user data changes over time
and settings where multiple users wish to pool their data to perform
useful collaborative computations without revealing too much
information.
By addressing one of the major obstacles preventing the viability of
personal information control, this work brings the attractive
proposition closer to reality.
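The core judgment loop described above can be illustrated with a deliberately simplified sketch: a finite secret, exact Bayesian updating rather than the thesis's sound approximations, and illustrative names throughout. The worst-case scan over possible answers mirrors the guarantee that risk is never underestimated.

```python
def bayes_update(prior, query, answer):
    """Condition the adversary's belief on observing query(secret) == answer."""
    posterior = {s: p for s, p in prior.items() if query(s) == answer}
    total = sum(posterior.values())
    return {s: p / total for s, p in posterior.items()}

def worst_case_risk(prior, query):
    """Highest probability the adversary could assign to any single
    secret value, over every answer the query might produce."""
    answers = {query(s) for s in prior}
    risk = 0.0
    for a in answers:
        posterior = bayes_update(prior, query, a)
        risk = max(risk, max(posterior.values()))
    return risk

def judge(prior, query, threshold):
    """Answer the query only if no possible answer reveals too much."""
    return worst_case_risk(prior, query) <= threshold
```

For a uniform prior over ten secret values, a query like `x >= 5` leaves the adversary at most 20% confident in any single value, while `x == 3` can make one value certain; under a 0.3 risk threshold the first would be answered and the second refused. Answered queries would then update the tracked belief via `bayes_update`, so later judgments account for everything already revealed.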
Leveraging Sociological Models for Predictive Analytics
Abstract: There is considerable interest in developing techniques for predicting human behavior, for instance to enable emerging contentious situations to be forecast or the nature of ongoing but "hidden" activities to be inferred. A promising approach to this problem is to identify and collect appropriate empirical data and then apply machine learning methods to these data to generate the predictions. This paper shows that the performance of such learning algorithms can often be improved substantially by leveraging sociological models in their development and implementation. In particular, we demonstrate that sociologically-grounded learning algorithms outperform gold-standard methods in three important and challenging tasks: 1) inferring the (unobserved) nature of relationships in adversarial social networks, 2) predicting whether nascent social diffusion events will "go viral", and 3) anticipating and defending against future actions of opponents in adversarial settings. Significantly, the new algorithms perform well even when there is limited data available for their training and execution. Keywords: predictive analysis, sociological models, social networks, empirical analysis, machine learning.
Navigating the Landscape of Robust and Secure Artificial Intelligence: A Comprehensive Literature Review
Addressing the multidimensional nature of Artificial Intelligence assurance, this survey elaborates on various aspects of ensuring the reliability and safety of computerized systems. It navigates model errors, unmodelled phenomena, and security threats to provide a thorough literature review. The review touches upon past mitigation strategies for model errors, the challenges of under-specification in modern ML models, and why understanding uncertainty is crucial. In addition, it examines the security basis of AI systems, the emerging field of Adversarial Machine Learning, and the processes necessary for testing and evaluating systems against adversarial case studies. The review also considers the DoD context, and how the terrain surrounding developmental and operational testing is changing, along with the cultural shifts that must accompany robust and secure AI implementation.
Cyber Flag: A Realistic Cyberspace Training Construct
As is well understood, the rapidly unfolding challenges of cyberspace represent a fundamental warfare paradigm shift, revolutionizing the way future wars will be fought and won. A significant test for the Air Force (indeed, for any organization with a credible presence in cyberspace) will be providing a realistic training environment that fully meets this challenge. Why create another Flag-level exercise? Realistic training (that which is effective, comprehensive, and coordinated) is crucial to success in time of war. Red Flag provides dominant training within the air domain, and now, with the evolution of cyberspace, a comprehensive training environment is necessary to meet this growing and broadening threat. This thesis builds on the Red Flag tactical training exercise in order to define a future environment that combines the air, space, and cyberspace domains, with specific emphasis on cyberspace capabilities and threats. Red Flag has been and continues to be a great tactical training exercise; Cyber Flag would use the best practices of Red Flag (and other realistic training venues) to define a future training environment for the cyberspace domain. There is no better training than the hands-on realism associated with participation in an exercise such as Red Flag. Secretary Michael W. Wynne has a vision for dominant operations in cyberspace comparable to the Air Force's global, strategic omnipresence in air and space. This bold vision requires a combination of joint coordination, skilled forces, and a realistic training environment to bring them all together; Cyber Flag is the suggested vehicle for accomplishing this.
Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense
The increasing instances of advanced attacks call for a new defense paradigm
that is active, autonomous, and adaptive, termed the \texttt{`3A'} defense
paradigm. This chapter introduces three defense schemes that actively interact
with attackers to increase the attack cost and gather threat information, i.e.,
defensive deception for detection and counter-deception, feedback-driven Moving
Target Defense (MTD), and adaptive honeypot engagement. Due to the cyber
deception, external noise, and the absent knowledge of the other players'
behaviors and goals, these schemes possess three progressive levels of
information restrictions, i.e., from the parameter uncertainty, the payoff
uncertainty, to the environmental uncertainty. To estimate the unknown and
reduce uncertainty, we adopt three different strategic learning schemes that
fit the associated information restrictions. All three learning schemes share
the same feedback structure of sensation, estimation, and actions so that the
most rewarding policies get reinforced and converge to the optimal ones in
autonomous and adaptive fashions. This work aims to shed light on proactive
defense strategies, lay a solid foundation for strategic learning under
incomplete information, and quantify the tradeoff between security and
costs.
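The shared feedback structure of sensation, estimation, and action can be sketched with standard tabular Q-learning, in which the most rewarding policies are reinforced over repeated interactions. The states, actions, rewards, and hyperparameters below are illustrative placeholders, not the chapter's actual defense schemes.

```python
import random

def q_learning(env_step, states, actions, episodes=2000,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Sketch of the sense -> estimate -> act loop: observe a reward,
    update the value estimate, and let rewarding actions be reinforced.
    env_step(state, action) must return (reward, next_state)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in states for a in actions}
    s = states[0]
    for _ in range(episodes):
        # Estimation: act greedily on current estimates, with
        # epsilon-probability exploration to keep gathering information.
        if rng.random() < epsilon:
            a = rng.choice(actions)
        else:
            a = max(actions, key=lambda act: q[(s, act)])
        reward, s_next = env_step(s, a)
        # Reinforcement: temporal-difference update toward observed payoff.
        best_next = max(q[(s_next, b)] for b in actions)
        q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
        s = s_next
    return q
```

In a toy defense setting where an "engage" action (e.g., honeypot engagement) pays off only when attacker activity is high, the learned values come to prefer engaging in the high-activity state, illustrating how the most rewarding policy is reinforced under uncertainty about the attacker's behavior.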
Security-by-experiment: lessons from responsible deployment in cyberspace
Conceiving of new technologies as social experiments is a means to discuss responsible deployment of technologies that may have unknown and potentially harmful side-effects. Thus far, the uncertain outcomes addressed in the paradigm of new technologies as social experiments have been mostly safety-related, meaning that potential harm is caused by the design plus accidental events in the environment. In some domains, such as cyberspace, adversarial agents (attackers) may be at least as important when it comes to undesirable effects of deployed technologies. In such cases, conditions for responsible experimentation may need to be implemented differently, as attackers behave strategically rather than probabilistically. In this contribution, we outline how adversarial aspects are already taken into account in technology deployment in the field of cyber security, and what the paradigm of new technologies as social experiments can learn from this. In particular, we show the importance of adversarial roles in social experiments with new technologies.