8,720 research outputs found

    Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art

    Get PDF
    Information Extraction, data Integration, and uncertain data management are different areas of research that got vast focus in the last two decades. Many researches tackled those areas of research individually. However, information extraction systems should have integrated with data integration methods to make use of the extracted information. Handling uncertainty in extraction and integration process is an important issue to enhance the quality of the data in such integrated systems. This article presents the state of the art of the mentioned areas of research and shows the common grounds and how to integrate information extraction and data integration under uncertainty management cover

    Ranking relations using analogies in biological and information networks

    Get PDF
    Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects S={A(1):B(1),A(2):B(2),…,A(N):B(N)}\mathbf{S}=\{A^{(1)}:B^{(1)},A^{(2)}:B^{(2)},\ldots,A^{(N)}:B ^{(N)}\}, measures how well other pairs A:B fit in with the set S\mathbf{S}. Our work addresses the following question: is the relation between objects A and B analogous to those relations found in S\mathbf{S}? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided.Comment: Published in at http://dx.doi.org/10.1214/09-AOAS321 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
    • …
    corecore