Constructing a vulnerability knowledge graph

Abstract

Attackers who exploit software vulnerabilities can cause severe damage to their victims. Despite the continuous efforts of security experts, the number of reported vulnerabilities keeps increasing. As of January 2022, the National Vulnerability Database contains more than 160 000 records of known vulnerabilities. Each record contains data such as the vulnerability classification, severity metrics, affected software products, and a textual description of the vulnerability. The National Vulnerability Database is a high-quality data source for security analysts learning about known vulnerabilities. However, maintaining this database comes at a high labor cost for the security experts involved. Knowledge graphs are a semantic technology with the potential to aid in this task. In this work, we explore how knowledge graphs are used in the broader field of cyber security. We then propose our own vulnerability knowledge graph for vulnerability assessment, in which we combine techniques from natural language processing (NLP) with knowledge graph embedding. Although future work on constructing ground truth data is necessary to evaluate and benchmark our experiments, our initial results show an entity prediction Hits@10 score of 0.76.
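
For readers unfamiliar with the metric, Hits@10 is commonly defined as the fraction of test queries for which the correct entity appears among the ten highest-ranked candidates. The following is a minimal sketch of that definition, not the evaluation code used in this work; the function name `hits_at_k` and the example ranks are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Return the fraction of test triples whose correct entity is ranked in the top k.

    `ranks` holds the 1-based rank that the model assigned to the correct
    entity for each test triple (rank 1 is the best possible).
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for five test triples (illustrative values only).
example_ranks = [1, 3, 12, 7, 250]
score = hits_at_k(example_ranks, k=10)
print(f"Hits@10 = {score:.2f}")  # prints "Hits@10 = 0.60"
```

Under this definition, a score of 0.76 would mean that the correct entity was ranked in the top ten for 76% of the evaluated predictions.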
