
Restructuring databases for knowledge discovery by consolidation and link formation

Abstract

Databases often identify entities of interest inaccurately. Two operations, consolidation and link formation, are proposed as essential components of KDD systems for certain applications; they complement the usual machine learning techniques, which use similarity-based clustering to discover classifications. Consolidation relates identifiers present in a database to a set of real-world entities (RWEs) that are not uniquely identified in the database. It may also be viewed as a transformation of representation from the identifiers present in the original database to the RWEs. Link formation constructs structured relationships between consolidated RWEs through identifiers and events explicitly represented in the database. Consolidation and link formation are easily implemented as index creation in relational database management systems. An operational knowledge discovery system uses consolidation and link formation to identify potential money laundering in a database of large cash transactions.
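The two operations can be illustrated with a minimal sketch. It is not the implementation of the system described above: the abstract describes index creation in a relational database, whereas this example works in memory and uses a union-find structure for consolidation. The record layout, the sample data, the matching rule (same name or same ID number) and the function names `consolidate` and `form_links` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical transaction records: (transaction, name, id_number, account).
# An id_number of None stands for a missing identifier.
RECORDS = [
    ("t1", "John Smith", "123", "A-1"),
    ("t2", "John Smith", None, "A-2"),
    ("t3", "Mary Jones", "456", "A-2"),
    ("t4", "M. Jones", "456", "A-3"),
    ("t5", "Ann Lee", "789", "A-9"),
]


def consolidate(records):
    """Assign each record to an entity; records sharing a name or an ID merge."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # identifier -> index of the first record carrying it
    for i, (_, name, id_number, _) in enumerate(records):
        for key in (("name", name.lower()), ("id", id_number)):
            if key[1] is None:
                continue
            if key in first_seen:
                parent[find(i)] = find(first_seen[key])
            else:
                first_seen[key] = i

    # Relabel the roots as consecutive entity numbers in order of appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(records))]


def form_links(records, entity_of):
    """Link pairs of entities that transact on the same account."""
    by_account = defaultdict(set)
    for (_, _, _, account), entity in zip(records, entity_of):
        by_account[account].add(entity)

    links = defaultdict(set)  # (entity_a, entity_b) -> shared accounts
    for account, entities in by_account.items():
        for a, b in combinations(sorted(entities), 2):
            links[(a, b)].add(account)
    return dict(links)


entity_of = consolidate(RECORDS)
links = form_links(RECORDS, entity_of)
```

On this sample data, the two "John Smith" records consolidate into one entity through the shared name, the two "Jones" records through the shared ID number, and the first two entities are then linked through account A-2. A real matching rule would need to be far more careful than exact name equality.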
