Dos and Don'ts of Machine Learning in Computer Security
With the growing processing power of computing systems and the increasing
availability of massive datasets, machine learning algorithms have led to major
breakthroughs in many different areas. This development has influenced computer
security, spawning a body of work on learning-based security systems, such as
those for malware detection, vulnerability discovery, and binary code analysis.
Despite great potential, machine learning in security is prone to subtle
pitfalls that undermine its performance and render learning-based systems
potentially unsuitable for security tasks and practical deployment. In this
paper, we look at this problem with a critical eye. First, we identify common
pitfalls in the design, implementation, and evaluation of learning-based
security systems. We conduct a study of 30 papers from top-tier security
conferences within the past 10 years, confirming that these pitfalls are
widespread in the current security literature. In an empirical analysis, we
further demonstrate how individual pitfalls can lead to unrealistic performance
and interpretations, obstructing the understanding of the security problem at
hand. As a remedy, we propose actionable recommendations to support researchers
in avoiding or mitigating the pitfalls where possible. Furthermore, we identify
open problems when applying machine learning in security and provide directions
for further research.
Comment: to appear at USENIX Security Symposium 202