
107 research outputs found

The importance of cross-lingual information for matching Wikipedia with the Cyc ontology

Author: Smywiński-Pohl Aleksander
Wróbel Krzysztof
Publication venue: Technical University of Aachen
Publication date: 01/01/2014

In this paper we try to answer the question of how cross-lingual evidence may improve matching between different classification schemas. We concentrate specifically on the task of mapping between Wikipedia categories and Cyc terms, as well as the classification of Wikipedia articles to the Cyc taxonomy, and show how this process may be improved by consuming the evidence that is available in different editions of Wikipedia. The results show that the performance of the mapping procedure may be improved by 0.6 to 4.9 percentage points, depending on the number of external Wikipedia editions and the given task.

Jagiellonian University Repository

Cross-lingual knowledge linking across wiki knowledge bases

Author: Jie Tang
Juanzi Li
Zhichun Wang
Zhigang Wang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

Wikipedia has become one of the largest knowledge bases on the Web. It attracted 513 million page views per day in January 2012. However, one critical issue for Wikipedia is that articles in different languages are very unbalanced. For example, the number of articles on Wikipedia in English has reached 3.8 million, while the number of Chinese articles is still less than half a million, and there are only 217 thousand cross-lingual links between articles of the two languages. On the other hand, there are more than 3.9 million Chinese Wiki articles on Baidu Baike and Hudong.com, two popular encyclopedias in Chinese. One important question is how to link the knowledge entries distributed in different knowledge bases. This will immensely enrich the information in the on-line knowledge bases and benefit many applications. In this paper, we study the problem of cross-lingual knowledge linking and present a linkage factor graph model. Features are defined according to some interesting observations. Experiments on the Wikipedia data set show that our approach can achieve a high precision of 85.8% with a recall of 88.1%. The approach found 202,141 new cross-lingual links between English Wikipedia and Baidu Baike.

CiteSeerX

Crossref

Mining Meaning from Wikipedia

Author: Legg Catherine
Medelyan Olena
Milne David
Witten Ian H.
Publication date: 01/01/2008

Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval and information extraction; and as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced. Comment: An extensive survey of re-using information in Wikipedia in natural language processing, information retrieval and extraction and ontology building. Accepted for publication in International Journal of Human-Computer Studies.

arXiv.org e-Print Archive

CiteSeerX

Deakin Research Online

Research Commons@Waikato

Foundational Ontologies meet Ontology Matching: A Survey

Author: Guizzardi Giancarlo
Pease Adam
Schmidt Daniela
Trojahn Cassia
Vieira Renata
Publication venue: 'IOS Press'
Publication date: 01/01/2021

Ontology matching is a research area aimed at finding ways to make different ontologies interoperable. Solutions to the problem have been proposed from different disciplines, including databases, natural language processing, and machine learning. The role of foundational ontologies for ontology matching is an important one. It is multifaceted and with room for development. This paper presents an overview of the different tasks involved in ontology matching that consider foundational ontologies. We discuss the strengths and weaknesses of existing proposals and highlight the challenges to be addressed in the future.

Repositório Científico da Universidade de Évora

Knowledge harvesting from text and web sources

Author: Fabian Suchanek
Gerhard Weikum
Publication date: 11/04/2020

The proliferation of knowledge-sharing communities such as Wikipedia and the progress in scalable information extraction from Web and text sources have enabled the automatic construction of very large knowledge bases. Recent endeavors of this kind include academic research projects such as DBpedia, KnowItAll, Probase, ReadTheWeb, and YAGO, as well as industrial ones such as Freebase and Trueknowledge. These projects provide automatically constructed knowledge bases of facts about named entities, their semantic classes, and their mutual relationships. Such world knowledge in turn enables cognitive applications and knowledge-centric services like disambiguating natural-language text, deep question answering, and semantic search for entities and relations in Web and enterprise data. Prominent examples of how knowledge bases can be harnessed include the Google Knowledge Graph and the IBM Watson question answering system. This tutorial presents state-of-the-art methods, recent advances, research opportunities, and open challenges along this avenue of knowledge harvesting and its applications.

CiteSeerX


The Value of Everything: Ranking and Association with Encyclopedic Knowledge

Author: Coursey Kino High
Publication venue: 'University of North Texas Libraries'
Publication date: 01/12/2009

This dissertation describes WikiRank, an unsupervised method of assigning relative values to elements of a broad coverage encyclopedic information source in order to identify those entries that may be relevant to a given piece of text. The valuation given to an entry is based not on textual similarity but instead on the links that associate entries, and an estimation of the expected frequency of visitation that would be given to each entry based on those associations in context. This estimation of relative frequency of visitation is embodied in modifications to the random walk interpretation of the PageRank algorithm. WikiRank is an effective algorithm to support natural language processing applications. It is shown to exceed the performance of previous machine learning algorithms for the task of automatic topic identification, providing results comparable to those of human annotators. Second, WikiRank is found useful for the task of recognizing text-based paraphrases on a semantic level, by comparing the distribution of attention generated by two pieces of text using the encyclopedic resource as a common reference. Finally, WikiRank is shown to have the ability to use its base of encyclopedic knowledge to recognize terms from different ontologies as describing the same thing, thus allowing for the automatic generation of mapping links between ontologies. The conclusion of this thesis is that the "knowledge access heuristic" is valuable and that a ranking process based on a large encyclopedic resource can form the basis for an extendable general purpose mechanism capable of identifying relevant concepts by association, which in turn can be effectively utilized for enumeration and comparison at a semantic level.

UNT Digital Library

A Survey on Knowledge Graphs: Representation, Acquisition and Applications

Author: Cambria Erik
Ji Shaoxiong
Marttinen Pekka
Pan Shirui
Yu Philip S.
Publication date: 17/01/2021

Human knowledge provides a formal understanding of the world. Knowledge graphs that represent structural relations between entities have become an increasingly popular research direction towards cognition and human-level intelligence. In this survey, we provide a comprehensive review of knowledge graphs covering overall research topics about 1) knowledge graph representation learning, 2) knowledge acquisition and completion, 3) temporal knowledge graphs, and 4) knowledge-aware applications, and summarize recent breakthroughs and perspective directions to facilitate future research. We propose a full-view categorization and new taxonomies on these topics. Knowledge graph embedding is organized from four aspects of representation space, scoring function, encoding models, and auxiliary information. For knowledge acquisition, especially knowledge graph completion, embedding methods, path inference, and logical rule reasoning are reviewed. We further explore several emerging topics, including meta relational learning, commonsense reasoning, and temporal knowledge graphs. To facilitate future research on knowledge graphs, we also provide a curated collection of datasets and open-source libraries on different tasks. In the end, we have a thorough outlook on several promising research directions.

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

Aaltodoc Publication Archive

Semantic Knowledge Graphs for the News: A Review

Author: Al-Moslmi Tareq
Dang Nguyen Duc Tien
Gallofré Ocaña Marc
Opdahl Andreas Lothe
Tessem Bjørnar
Veres Csaba
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2022

ICT platforms for news production, distribution, and consumption must exploit the ever-growing availability of digital data. These data originate from different sources and in different formats; they arrive at different velocities and in different volumes. Semantic knowledge graphs (KGs) are an established technique for integrating such heterogeneous information. They are therefore well aligned with the needs of news producers and distributors, and they are likely to become increasingly important for the news industry. This article reviews the research on using semantic knowledge graphs for production, distribution, and consumption of news. The purpose is to present an overview of the field; to investigate what it means; and to suggest opportunities and needs for further research and development.

University of Bergen