4 research outputs found

    Restructuring databases for knowledge discovery by consolidation and link formation

    Get PDF
    Databases often inaccurately identify entities of interest. Two operations, consolidation and link formation, which complement the usual machine learning techniques that use similarity-based clustering to discover classifications, are proposed as essential components of KDD systems for certain applications. Consolidation relates identifiers present in a database to a set of real world entities (RWE’s) which are not uniquely identified in the database. Consolidation may also be viewed as a transformation of representation from the identifiers present in the original database to the RWE’s. Link formation constructs structured relationships between consolidated RWE’s through identifiers and events explicitly represented in the database. Consolidation and link formation are easily implemented as index creation in relational database management systems. An operational knowledge discovery system identifies potential money laundering in a database of large cash transactions using consolidation and link formation

    Distinguishing the Unexplainable from the Merely Unusual: Adding Explanations to Outliers to Discover and Detect Significant Complex Rare Events

    No full text
    ABSTRACT This paper discusses the key role of explanations for applications that discover and detect significant complex rare events. These events are distinguished not necessarily by outliers (i.e., unusual or rare data values), but rather by their inexplicability in terms of appropriate real-world behaviors. Outlier detection techniques are typically part of such applications and may provide useful starting points; however, they are far from sufficient for identifying events of interest and discriminating them from similar but uninteresting events to a degree necessary for operational utility. Other techniques that distinguish anomalies from outliers, and then enable anomalies to be classified as relevant or not to the particular detection problem are also necessary. We argue that explanations are the key to the effectiveness of such complex rare event detection applications, and illustrate this point with examples from several real applications
    corecore