We discuss the KDD process in "data-flow" environments, where unstructured and time dependent data can be processed into various levels of structured and semantically-rich forms for analysis tasks. Using network intrusion detection as a concrete application example, we describe how to construct models that are both accurate in describing the underlying concepts, and efficient when used to analyze data in real-time. We present procedures for analyzing frequent patterns from lower level data and constructing appropriate features to formulate higher level data. The features generated from various levels of data have different computational costs (in time and space). We show that in order to minimize the time required in using the classification models in a real-time environment, we can exploit the "necessary conditions" associated with the low-cost features to determine whether some high-cost features need to be computed and the corresponding classification rules need to be checked. We have applied our tools to the problem of building network intrusion detection models. We report our experiments using the network data provided as part of the 1998 DARPA Intrusion Detection Evaluation program. We also discuss our experience in using the mined models in NFR, a real-time network intrusion detection system

Lee, Wenke

Mok, Kui W.

Stolfo, Salvatore

We discuss the KDD process in “data-flow ” environments, where unstructured and time dependent data can be processed into various levels of structured and semantically-rich forms for analysis tasks. Using network intrusion detection as a concrete application example, we describe how to construct models that are both accurate in describing the underlying concepts, and efficient when used to analyze data in real-time. We present procedures for analyzing frequent patterns from lower level data and constructing appropriate features to formulate higher level data. The features generated from various levels of data have different computational costs (in time and space). We show that in order to minimize the time required in using the classification models in a real-time environment, we can exploit the “necessary conditions ” associated with the low-cost features to determine whether some high-cost features need to be computed and the corresponding classification rules need to be checked. We have applied our tools to the problem of building network intrusion detection models. We report our experiments using the network data provided as part of the 1998 DARPA Intrusion Detection Evaluation program. We also discuss our experience in using the mined models in NFR, a real-time network intrusion detection system. 

Wenke Lee

Salvatore J. Stolfo

Kui W. Mok

CiteSeerX

Mining in a dataflow environment: Experience in network intrusion detection

Columbia University Academic Commons

Mining in a Data-flow Environment: Experience in Network Intrusion Detection

In this paper we discuss the KDD process in &quot;data-flow&quot; environments, where unstructured and time dependent data can be processed into various levels of structured and semantically-rich forms for analysis tasks. Using network intrusion detection as a concrete application example, we describe how to construct models that are both accurate in describing the underlying concepts, and efficient when used to analyze data in real-time. We present procedures for analyzing frequent patterns from lower level data and constructing appropriate features to formulate higher level data. The features generated from various levels of data have different computational costs (in time and space). We show that in order to minimize the time required in using the classification models in a real-time environment, we can exploit the &quot;necessary conditions&quot; associated with the low-cost features to determine whether some high-cost features need to be computed and the corresponding classification rules need to be..

https://academiccommons.columbia.edu/doi/10.7916/D8ZW1SQ0/download

Mining in a Data-flow Environment: Experience in Network Intrusion Detection

Abstract

Similar works

Full text

Available Versions

CiteSeerX

Columbia University Academic Commons

CiteSeerX