AI ATAC 1: An Evaluation of Prominent Commercial Malware Detectors

Beaver, Justin M.; Bridges, Robert A.; Huffer, Kelly M. T.; Iannacone, Michael D.; Jewell, Brian; Nichols, Jeff A.; Norem, Savannah; Oesch, T. Sean; Smith, Jared M.; Spakes, Kevin; Stahl, Chelsey Dunivan; Verma, Miki E.; Watson, Cory; Weber, Brian


Authors: Justin M. Beaver
Robert A. Bridges
Kelly M. T. Huffer
Michael D. Iannacone
Brian Jewell
Jeff A. Nichols
Savannah Norem
T. Sean Oesch
Jared M. Smith
Kevin Spakes
Chelsey Dunivan Stahl
Miki E. Verma
Cory Watson
Brian Weber
Publication date: 28 August 2023

Abstract

This work presents an evaluation of six prominent commercial endpoint malware detectors, a network malware detector, and a file-conviction algorithm from a cyber technology vendor. The evaluation was administered as the first of the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) prize challenges, funded by and completed in service of the US Navy. The experiment employed 100K files (50% benign, 50% malicious) with a stratified distribution of file types, including ~1K zero-day program executables, increasing experiment size by two orders of magnitude over previous work. We present an evaluation process that delivers a file to a fresh virtual machine running the detection technology, waits 90s to allow static detection, then executes the file and waits another period for dynamic detection. This allows greater fidelity in the observational data than previous experiments, in particular for resource and time-to-detection statistics. To execute all 800K trials (100K files × 8 tools), a software framework was designed to choreograph the experiment into a completely automated, time-synced, and reproducible workflow with substantial parallelization. A cost-benefit model was configured to integrate the tools' recall, precision, time to detection, and resource requirements into a single comparable quantity by simulating costs of use. This provides a ranking methodology for cyber competitions and a lens through which to reason about the varied statistical viewpoints of the results. These statistical and cost-model results provide insights on the state of commercial malware detection.
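
As an illustration of the trial structure described in the abstract, the following is a minimal Python sketch, not the authors' framework. It assumes a hypothetical `Trial` record holding a ground-truth label and a time to detection measured from file delivery, and it assumes that any alert raised within the 90s window counts as static detection and any later alert as dynamic. The sample values are invented for illustration and are not results from the paper; the cost-benefit model itself is not reproduced here because the abstract does not specify it.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

STATIC_WAIT_S = 90.0   # static-detection window stated in the abstract
NUM_FILES = 100_000    # files in the experiment, per the abstract
NUM_TOOLS = 8          # detection tools evaluated, per the abstract


@dataclass
class Trial:
    """Hypothetical record of one (file, tool) trial."""
    is_malicious: bool
    # Seconds from file delivery to the tool's alert; None if no alert was raised.
    detect_time_s: Optional[float]


def detection_phase(trial: Trial) -> str:
    """Classify an alert as 'static', 'dynamic', or 'none' (assumed convention)."""
    if trial.detect_time_s is None:
        return "none"
    return "static" if trial.detect_time_s <= STATIC_WAIT_S else "dynamic"


def precision_recall(trials: List[Trial]) -> Tuple[float, float]:
    """Compute precision and recall, treating any alert as a conviction."""
    tp = sum(1 for t in trials if t.is_malicious and t.detect_time_s is not None)
    fp = sum(1 for t in trials if not t.is_malicious and t.detect_time_s is not None)
    fn = sum(1 for t in trials if t.is_malicious and t.detect_time_s is None)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Invented toy data for one tool: three malicious files, two benign files.
sample = [
    Trial(True, 10.0),    # alert before execution
    Trial(True, 120.0),   # alert after execution
    Trial(True, None),    # missed
    Trial(False, 30.0),   # false alert
    Trial(False, None),   # correctly ignored
]

total_trials = NUM_FILES * NUM_TOOLS
precision, recall = precision_recall(sample)
```

Per-tool precision, recall, and time-to-detection summaries of this kind are the inputs the abstract says the cost-benefit model combines with resource requirements.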


Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2308.14835
