Cyber threat attribution is the process of identifying the actor behind an attack
incident in cyberspace. Accurate and timely attribution plays an important role
in deterring future attacks, as it allows appropriate defense mechanisms to be
applied in time. Manual analysis of attack patterns gathered by honeypot
deployments, intrusion detection systems, firewalls, and via trace-back
procedures is still the preferred method of security analysts for cyber threat
attribution. Such attack patterns are low-level Indicators of Compromise (IOC).
High-level IOC, on the other hand, represent the Tactics, Techniques, and
Procedures (TTP) and the software tools used by the adversaries in their
campaigns. Adversaries rarely re-use low-level IOC, and these indicators can
also be manipulated, resulting in false and unfair attribution. To
empirically evaluate and compare the effectiveness of both kinds of IOC, there
are two problems that need to be addressed. The first problem is that in recent
research works, the ineffectiveness of low-level IOC for cyber threat
attribution has been argued only intuitively; an empirical evaluation of the
effectiveness of low-level IOC based on a real-world dataset is missing. The
second problem is that the available high-level IOC dataset contains only a
single instance for each predictive class label and therefore cannot be used
directly for training machine learning models.
To address these problems, in this research work we empirically evaluate the
effectiveness of low-level IOC based on a real-world dataset that is
specifically built for a comparative analysis with high-level IOC. The
experimental results show that models trained on high-level IOC attribute
cyberattacks with an accuracy of 95%, compared to an accuracy of 40% for models
trained on low-level IOC.
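
The abstract does not specify the learning pipeline used for the comparison. The
sketch below is only an illustration of the kind of side-by-side evaluation it
describes, assuming a hypothetical tabular incident dataset with a threat-actor
label and a standard scikit-learn TF-IDF plus random-forest setup; the file name
"incidents.csv", the column names, and the model choice are assumptions, not the
paper's actual method.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    def attribution_accuracy(df, ioc_column, label_column="threat_actor"):
        """Train a classifier on one kind of IOC text and return held-out accuracy."""
        X = TfidfVectorizer().fit_transform(df[ioc_column])
        y = df[label_column]
        # Assumes every actor appears in several incidents, so a stratified split is possible.
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.2, stratify=y, random_state=0)
        model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
        return accuracy_score(y_test, model.predict(X_test))

    # Hypothetical dataset: one row per attack incident, the responsible actor as
    # the class label, and both kinds of IOC recorded as free text.
    incidents = pd.read_csv("incidents.csv")
    print("low-level IOC accuracy :", attribution_accuracy(incidents, "low_level_ioc"))
    print("high-level IOC accuracy:", attribution_accuracy(incidents, "high_level_ttp"))

In practice, the single-instance classes noted in the second problem above would
first have to be augmented or merged before such a stratified split is possible.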