Cybersecurity Information Exchange with Privacy (CYBEX-P) and TAHOE – A Cyberthreat Language

Abstract

Cybersecurity information sharing (CIS) is envisioned to protect organizations more effectively from advanced cyberattacks. However, fully automated CIS platforms have not been widely adopted. The major challenges are (1) the absence of advanced data analytics capabilities and (2) the absence of a robust cyberthreat language (CTL). This work introduces Cybersecurity Information Exchange with Privacy (CYBEX-P), a CIS framework, to tackle these challenges. CYBEX-P allows organizations to share heterogeneous data from various sources. It correlates the data to automatically generate intuitive reports and defensive rules. To achieve such versatility, we have developed TAHOE, a graph-based CTL. TAHOE is a structure for storing, sharing, and analyzing threat data. It also intrinsically correlates the data. We have further developed a universal Threat Data Query Language (TDQL). In this work, we propose the system architecture for CYBEX-P. We then discuss its scalability along with a protocol to correlate attributes of threat data. We further introduce TAHOE and TDQL as better alternatives to existing CTLs and formulate ThreatRank, an algorithm to detect new malicious events.

We have developed CYBEX-P as a complete CIS platform, not only for data sharing but also for advanced threat data analysis. To that end, we have developed two frameworks that use CYBEX-P infrastructure as a service (IaaS). The first is a phishing URL detector that uses machine learning to detect new phishing URLs. This real-time system adapts to the ever-changing landscape of phishing URLs and maintains an accuracy of 86%. The second models attacker behavior in a botnet. It combines heterogeneous threat data and analyzes it together to predict the behavior of an attacker on a host infected by bot malware. We have achieved a prediction accuracy of 85-97% using our methodology. These two frameworks establish the feasibility of CYBEX-P as a platform for advanced threat data analysis by future researchers.
