4 research outputs found

    Multi-column substring matching for database schema translation

    We describe a method for discovering complex schema translations involving substrings from multiple database columns. The method does not require a training set of instances linked across databases and it is capable of dealing with both fixed- and variable-length field columns. We propose an iterative algorithm that deduces the correct sequence of concatenations of column substrings in order to translate from one database to another. We introduce the algorithm along with examples on common database data values and examine its performance on real-world and synthetic datasets.
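
To make the idea of a translation as a "sequence of concatenations of column substrings" concrete, here is a minimal Python sketch. It is not the algorithm from the paper: the greedy prefix-matching search, the function names (`apply_formula`, `find_formula`), and the parameters `max_parts` and `max_len` are all assumptions introduced for illustration. It only shares the two properties the abstract states, namely that source rows are not linked to target values and that a formula is built up one substring at a time.

```python
from itertools import product


def apply_formula(row, formula):
    """Concatenate row[col][start:end] for each (col, start, end) part."""
    return "".join(row[col][start:end] for col, start, end in formula)


def exact_score(rows, targets, formula):
    """Fraction of source rows whose translated value occurs among the targets."""
    return sum(apply_formula(r, formula) in targets for r in rows) / len(rows)


def prefix_score(rows, targets, formula):
    """(fraction of rows whose output is a prefix of some target, total output length)."""
    outputs = [apply_formula(r, formula) for r in rows]
    hits = sum(any(t.startswith(o) for t in targets) for o in outputs if o)
    return hits / len(rows), sum(len(o) for o in outputs)


def find_formula(rows, target_values, max_parts=3, max_len=8):
    """Greedy, illustrative search (an assumption, not the published method).

    Each iteration appends the substring part that keeps the most rows
    consistent with some target value, preferring longer outputs on ties.
    """
    targets = set(target_values)
    columns = list(rows[0])
    formula = []
    for _ in range(max_parts):
        if formula and exact_score(rows, targets, formula) == 1.0:
            break
        candidates = [
            formula + [(col, start, end)]
            for col, start in product(columns, range(max_len))
            for end in [*range(start + 1, max_len + 1), None]
        ]
        formula = max(candidates, key=lambda f: prefix_score(rows, targets, f))
    return formula


# Hypothetical data: the target list is deliberately in a different order,
# since no row-level links between the two databases are assumed.
rows = [
    {"first": "john", "last": "smith"},
    {"first": "mary", "last": "jones"},
    {"first": "alan", "last": "doe"},
]
target_values = ["mjones", "adoe", "jsmith"]
formula = find_formula(rows, target_values)
translated = [apply_formula(r, formula) for r in rows]
```

On this toy input the search first selects the leading character of `first` and then a slice covering `last`, so `translated` reproduces the target values. A greedy search like this can be misled by coincidental prefix matches on larger data, which is one reason it should be read as an illustration only.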

    Engineering truly automated data integration and translation systems

    This thesis presents an automated, data-driven integration process for relational databases. Whereas previous integration methods assumed a large amount of user involvement as well as the availability of database meta-data, we make no use of meta-data and require little end-user input. This is done using a novel join- and translation-finding algorithm that searches for the proper key / foreign key relationships while inferring the instance transformations from one database to another. Because we rely only on the relations that bind the attributes together, we make no use of the database schema information. A novel searching method allows us to search the database for relevant objects without requiring server-side indexes or cooperative databases.
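
As a rough illustration of finding key / foreign key relationships from data alone, the Python sketch below proposes join candidates by value containment: a column with unique values is treated as a possible key, and a column in another table whose values mostly fall inside it is treated as a possible foreign key. This is a standard inclusion-dependency heuristic and an assumption on our part, not the thesis's algorithm. It also matches values exactly, whereas the thesis additionally infers instance transformations. The function name `find_join_candidates`, the `min_overlap` threshold, and the sample tables are hypothetical.

```python
def find_join_candidates(tables, min_overlap=0.9):
    """Propose (key_table, key_col, fk_table, fk_col, overlap) tuples.

    `tables` maps table name -> {column name -> list of values}.
    Only the data values are inspected; no schema meta-data is used.
    """
    results = []
    for key_table, key_cols in tables.items():
        for key_col, key_vals in key_cols.items():
            # A candidate key must be non-empty and free of duplicates.
            if not key_vals or len(set(key_vals)) != len(key_vals):
                continue
            key_set = set(key_vals)
            for fk_table, fk_cols in tables.items():
                if fk_table == key_table:
                    continue
                for fk_col, fk_vals in fk_cols.items():
                    if not fk_vals:
                        continue
                    # Fraction of referencing values found in the candidate key.
                    overlap = sum(v in key_set for v in fk_vals) / len(fk_vals)
                    if overlap >= min_overlap:
                        results.append((key_table, key_col, fk_table, fk_col, overlap))
    return results


# Hypothetical sample data.
tables = {
    "customers": {"id": [1, 2, 3], "name": ["a", "b", "c"]},
    "orders": {"order_id": [10, 11, 12, 13], "cust": [1, 1, 3, 2]},
}
candidates = find_join_candidates(tables)
```

Here the only candidate reported is `customers.id` referenced by `orders.cust`. In practice small integer columns overlap by coincidence, so a heuristic like this yields candidates to be verified rather than confirmed relationships.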