2 research outputs found

    multi-objective optimization integration of query interfaces for the deep web based on attribute constraints

    No full text
    In order to query and retrieve the rich and useful information hidden in the Deep Web efficiently, extensive research on domain-specific Deep Web Data Integration Systems (DWDIS) has been carried out in recent years. In DWDIS, large-scale automatic integration of query interfaces of domain-specific Web Databases (WDBs) remains a serious challenge due to the scale of the problem and the great diversity of the WDBs' query interfaces. To address this challenge, in this paper, we first give a definition of the constraint matrix which can accurately describe three types of constraints (hierarchical constraints, group constraints and precedence constraints) and the strengths of attributes of a query interface, and then prove that the schema tree of the query interface corresponds to only one constraint matrix, and vice versa. Furthermore, we transform the problem of integrating domain-specific query interfaces into a problem of integrating the constraint matrices and set up a multi-objective optimization problem model. To effectively solve the optimization model, some strategies to extend and merge the constraint matrices are designed. A method for automatically detecting and filtering abnormal data (noises) in the query interfaces is also proposed. More importantly, a novel and efficient algorithm applicable to large-scale automatic integration of domain-specific query interfaces is developed. Finally, the proposed algorithm is evaluated by experiments on the real query interface data set. Our theoretical analysis and experimental results show that the proposed algorithm outperforms existing state-of-the-art integration algorithms of domain-specific query interfaces. © 2013 Elsevier B.V. All rights reserved.In order to query and retrieve the rich and useful information hidden in the Deep Web efficiently, extensive research on domain-specific Deep Web Data Integration Systems (DWDIS) has been carried out in recent years. In DWDIS, large-scale automatic integration of query interfaces of domain-specific Web Databases (WDBs) remains a serious challenge due to the scale of the problem and the great diversity of the WDBs' query interfaces. To address this challenge, in this paper, we first give a definition of the constraint matrix which can accurately describe three types of constraints (hierarchical constraints, group constraints and precedence constraints) and the strengths of attributes of a query interface, and then prove that the schema tree of the query interface corresponds to only one constraint matrix, and vice versa. Furthermore, we transform the problem of integrating domain-specific query interfaces into a problem of integrating the constraint matrices and set up a multi-objective optimization problem model. To effectively solve the optimization model, some strategies to extend and merge the constraint matrices are designed. A method for automatically detecting and filtering abnormal data (noises) in the query interfaces is also proposed. More importantly, a novel and efficient algorithm applicable to large-scale automatic integration of domain-specific query interfaces is developed. Finally, the proposed algorithm is evaluated by experiments on the real query interface data set. Our theoretical analysis and experimental results show that the proposed algorithm outperforms existing state-of-the-art integration algorithms of domain-specific query interfaces. © 2013 Elsevier B.V. All rights reserved
    corecore