To help security analysts obtain information relevant to their network, such
as novel vulnerabilities, exploits, or patches, information retrieval methods
tailored to the security domain are needed. As labeled text
data is scarce and expensive, we follow developments in semi-supervised Natural
Language Processing and implement a bootstrapping algorithm for extracting
security entities and their relationships from text. The algorithm requires
little input data, specifically a few relations or patterns (heuristics for
identifying relations), and incorporates an active learning component that
queries the user on the most important decisions to prevent drifting from the
desired relations. Preliminary testing on a small corpus shows promising
results, obtaining a precision of 0.82.

Comment: 4 pages in Cyber & Information Security Research Conference 2015, AC
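
For illustration only, the bootstrapping loop with an active learning query might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the function names (`induce_patterns`, `apply_patterns`, `bootstrap`), the string-matching patterns, the `min_support` threshold, and the rule "ask the user about any candidate supported by fewer than `min_support` patterns" are all hypothetical choices made to keep the example short. The `oracle` callable stands in for the human analyst.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (subject entity, object entity)


def induce_patterns(corpus: List[str], pairs: Set[Pair]) -> Set[str]:
    """Treat the text between a known subject and object as a pattern."""
    patterns: Set[str] = set()
    for sentence in corpus:
        for subj, obj in pairs:
            start = sentence.find(subj)
            if start == -1:
                continue
            end = sentence.find(obj, start + len(subj))
            if end == -1:
                continue
            middle = sentence[start + len(subj):end].strip()
            if middle:
                patterns.add(middle)
    return patterns


def apply_patterns(corpus: List[str], patterns: Set[str]) -> Dict[Pair, Set[str]]:
    """Map each candidate pair to the set of patterns that extracted it."""
    support: Dict[Pair, Set[str]] = defaultdict(set)
    for sentence in corpus:
        for pattern in patterns:
            left, sep, right = sentence.partition(f" {pattern} ")
            if sep and left.split() and right.split():
                subj = left.split()[-1]
                obj = right.split()[0].rstrip(".,;")
                support[(subj, obj)].add(pattern)
    return support


def bootstrap(corpus: List[str], seeds: Set[Pair],
              oracle: Callable[[Pair], bool],
              min_support: int = 2, max_rounds: int = 5) -> Set[Pair]:
    """Alternate pattern induction and relation extraction from the seeds.

    Assumption: candidates backed by fewer than `min_support` patterns are
    the uncertain ones, so only those are sent to the oracle (the user).
    """
    pairs, rejected = set(seeds), set()
    for _ in range(max_rounds):
        patterns = induce_patterns(corpus, pairs)
        added = False
        for pair, pats in sorted(apply_patterns(corpus, patterns).items()):
            if pair in pairs or pair in rejected:
                continue
            if len(pats) >= min_support or oracle(pair):
                pairs.add(pair)
                added = True
            else:
                rejected.add(pair)  # remember the answer; do not ask again
        if not added:
            break
    return pairs


# Toy corpus and seed; the last sentence is bait for semantic drift.
CORPUS = [
    "CVE-2014-0160 affects OpenSSL.",
    "CVE-2014-6271 affects Bash.",
    "CVE-2014-6271 is a vulnerability in Bash.",
    "CVE-2014-0160 is a vulnerability in OpenSSL.",
    "CVE-2015-0235 is a vulnerability in glibc.",
    "Heartbleed affects everyone.",
]
SEEDS = {("CVE-2014-0160", "OpenSSL")}

if __name__ == "__main__":
    # Stand-in for the analyst: accept only subjects that look like CVE IDs.
    result = bootstrap(CORPUS, SEEDS, lambda pair: pair[0].startswith("CVE-"))
    print(sorted(result))
```

In this toy run, a candidate extracted by both patterns is accepted without a query, while candidates seen under a single pattern are referred to the oracle, which is where a drifting extraction such as the one from the last sentence can be rejected.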