IDPS Signature Classification with a Reject Option and the Incorporation of Expert Knowledge
As the importance of intrusion detection and prevention systems (IDPSs)
increases, great costs are incurred to manage the signatures that are generated
by malicious communication pattern files. Experts in network security need to
classify signatures by importance for an IDPS to work. We propose and evaluate
a machine learning signature classification model with a reject option (RO) to
reduce the cost of setting up an IDPS. To train the proposed model, it is
essential to design features that are effective for signature classification.
Experts classify signatures with predefined if-then rules. An if-then rule
returns a label of low, medium, high, or unknown importance based on keyword
matching of the elements in the signature. Therefore, we first design two types
of features, symbolic features (SFs) and keyword features (KFs), which are used
in keyword matching for the if-then rules. Next, we design web information and
message features (WMFs) to capture the properties of signatures that do not
match the if-then rules. The WMFs are extracted as term frequency-inverse
document frequency (TF-IDF) features of the message text in the signatures. The
features are obtained by web scraping from the referenced external attack
identification systems described in the signature. Because failure needs to be
minimized in the classification of IDPS signatures, as in the medical field, we
consider introducing an RO in our proposed model. The effectiveness of the
proposed classification model is evaluated in experiments with two real
datasets composed of signatures labeled by experts: a dataset that can be
classified with if-then rules and a dataset with elements that do not match an
if-then rule. In both
cases, the combined SFs and WMFs performed better than the combined SFs and
KFs. We also performed feature analysis.
Comment: 9 pages, 5 figures, 3 tables
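The three mechanisms named in this abstract (keyword-matching if-then rules, TF-IDF message features, and a reject option) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the rule keywords, the function names, the unsmoothed TF-IDF variant, and the confidence-threshold form of the reject option are not taken from the paper, which does not specify them in the abstract.

```python
import math
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical if-then rules: (label, keywords). Keywords are illustrative only.
RULES = [
    ("high", ["trojan", "exploit"]),
    ("medium", ["scan"]),
    ("low", ["policy"]),
]


def rule_label(message: str) -> str:
    """Return the label of the first rule whose keyword occurs in the message."""
    text = message.lower()
    for label, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return label
    return "unknown"  # no rule matched


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Textbook TF-IDF: tf = count / doc length, idf = log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in Counter(doc).items()
        }
        for doc in docs
    ]


def classify_with_reject(
    probs: Dict[str, float], threshold: float = 0.8
) -> Optional[str]:
    """Return the most probable label, or None (reject) if confidence is too low.

    Assumption: rejection is a simple threshold on the top class probability.
    """
    label = max(probs, key=probs.get)
    return label if probs[label] >= threshold else None


if __name__ == "__main__":
    print(rule_label("ET TROJAN example beacon"))
    print(tfidf([["sql", "injection"], ["sql", "scan"]]))
    print(classify_with_reject({"low": 0.4, "medium": 0.35, "high": 0.25}))
```

A rejected signature (a `None` result) would be passed to a human expert rather than labeled automatically, which is how a reject option trades coverage for fewer misclassifications.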
Automated user modeling for personalized digital libraries
Digital libraries (DLs) have become one of the most common ways of accessing digitized information. Because of this key role, users welcome any improvement in the services they receive from digital libraries. One way to improve digital services is personalization. Up to now, the most common approach to personalization in digital libraries has been user-driven. Nevertheless, efficient personalized services must be designed, at least in part, automatically. In this context, machine learning techniques automate the process of constructing user models. This paper proposes a new approach to constructing digital libraries that satisfy users' information needs: Adaptive Digital Libraries, libraries that automatically learn user preferences and goals and personalize their interaction using this information.
Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art
Information extraction, data integration, and uncertain data management are distinct research areas that have received considerable attention over the last two decades. Many studies have tackled these areas individually. However, information extraction systems should be integrated with data integration methods to make use of the extracted information. Handling uncertainty in the extraction and integration process is important for improving the quality of the data in such integrated systems. This article presents the state of the art in these research areas, shows their common ground, and discusses how to integrate information extraction and data integration under the umbrella of uncertainty management.