79 research outputs found

    Efficient Ranked Keyword Using AML

    Get PDF
    Entity Recognition is process of identifying predefined entities such as person names, products, or locations in a given docu ment. This is done by finding all possible substrings from a document that match any reference in the given entity dictionary. Approximate Membership Extraction (AME) method was used for finding all substrings in a given document that can approximately match any c lean references but it generates many redundant matched substrings because of approximation (rough calculation), thus rendering AME is not suitable for real - world tasks based on entity extraction. We propose a web - based join framework which combines a web search along with the approximate membership localization. Our process first provides a top n number of documents fetched from the web using a general search using the given query and then approximate membership localization(AML) is applied on these documents using the clear reference table and extra cts the entities form the document to form the intermediate reference table using Edit distance Vector, Score Correlation

    Name Disambiguation from link data in a collaboration graph using temporal and topological features

    Get PDF
    In a social community, multiple persons may share the same name, phone number or some other identifying attributes. This, along with other phenomena, such as name abbreviation, name misspelling, and human error leads to erroneous aggregation of records of multiple persons under a single reference. Such mistakes affect the performance of document retrieval, web search, database integration, and more importantly, improper attribution of credit (or blame). The task of entity disambiguation partitions the records belonging to multiple persons with the objective that each decomposed partition is composed of records of a unique person. Existing solutions to this task use either biographical attributes, or auxiliary features that are collected from external sources, such as Wikipedia. However, for many scenarios, such auxiliary features are not available, or they are costly to obtain. Besides, the attempt of collecting biographical or external data sustains the risk of privacy violation. In this work, we propose a method for solving entity disambiguation task from link information obtained from a collaboration network. Our method is non-intrusive of privacy as it uses only the time-stamped graph topology of an anonymized network. Experimental results on two real-life academic collaboration networks show that the proposed method has satisfactory performance.Comment: The short version of this paper has been accepted to ASONAM 201

    Entity Identity Reconciliation based Big Data Federation A MDE approach

    Get PDF
    “Information is power” is a sentence attributed to Francis Bacon that acquired a high important in the current era of the information. However, too much information can be a negative aspect. The term of “Infoxication” refers to the difficulty a person can have understanding an issue and making decisions that can be caused by the presence of too much information. With the increasing of relevance of open data and big database, the application of mechanisms and solutions to manage information is critical. This paper introduces the problem of unique identification and data reconciliation and offers a discussion about how to solve this problem in big and open data environment. The problem of data reconciliation in multiple databases and the unique identification of entities is not a new problem, but, how effective are classical mechanisms in the new internet environment? In this paper a solution based on model-driven engineering and virtual graph is presented in order to improve the processing of information in big open repositories. The paper illustrates the idea with a real example for the right exploitation of heritage information in the south of Spain.Ministerio de Ciencia e Innovación TIN2013-46928-C3-3-

    Handling instance coreferencing in the KnoFuss architecture

    Get PDF
    Finding RDF individuals that refer to the same real-world entities but have different URIs is necessary for the efficient use of data across sources. The requirements for such instance-level integration of RDF data are different from both database record linkage and ontology schema matching scenarios. Flexible configuration and reuse of different methods is needed to achieve good performance. Our data integration architecture, called KnoFuss, implements a component-based approach, which allows flexible selection and tuning of methods and takes the ontological schemata into account to improve the reusability of methods

    Entity Identity Reconciliation based Big Data Federation-A MDE approach

    Get PDF
    “Information is power” is a sentence attributed to Francis Bacon that acquired a high important in the current era of the information. However, too much information can be a negative aspect. The term of “Infoxication” refers to the difficulty a person can have understanding an issue and making decisions that can be caused by the presence of too much information. With the increasing of relevance of open data and big database, the application of mechanisms and solutions to manage information is critical. This paper introduces the problem of unique identification and data reconciliation and offers a discussion about how to solve this problem in big and open data environment. The problem of data reconciliation in multiple databases and the unique identification of entities is not a new problem, but, how effective are classical mechanisms in the new internet environment? In this paper a solution based on model-driven engineering and virtual graph is presented in order to improve the processing of information in big open repositories. The paper illustrates the idea with a real example for the right exploitation of heritage information in the south of Spain
    • …
    corecore