564 research outputs found
On Type-Aware Entity Retrieval
Today, the practice of returning entities from a knowledge base in response
to search queries has become widespread. One of the distinctive characteristics
of entities is that they are typed, i.e., assigned to some hierarchically
organized type system (type taxonomy). The primary objective of this paper is
to gain a better understanding of how entity type information can be utilized
in entity retrieval. We perform this investigation in an idealized "oracle"
setting, assuming that we know the distribution of target types of the relevant
entities for a given query. We perform a thorough analysis of three main
aspects: (i) the choice of type taxonomy, (ii) the representation of
hierarchical type information, and (iii) the combination of type-based and
term-based similarity in the retrieval model. Using a standard entity search
test collection based on DBpedia, we find that type information proves most
useful when using large type taxonomies that provide very specific types. We
provide further insights on the extensional coverage of entities and on the
utility of target types.
Comment: Proceedings of the 3rd ACM International Conference on the Theory of
Information Retrieval (ICTIR '17), 201
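A minimal sketch of aspect (iii), combining type-based and term-based similarity, under stated assumptions: the abstract does not give the retrieval model's formula, so linear interpolation with a weight `lam` is assumed here as one common choice. The function names, the toy type score (the share of the oracle target-type distribution covered by an entity's types), and the example scores are all hypothetical illustrations, not the paper's method.

```python
def type_similarity(target_types, entity_types):
    """Toy type score (an assumption, not the paper's measure):
    probability mass of the query's target types that the entity is assigned to."""
    return sum(prob for t, prob in target_types.items() if t in entity_types)


def combined_score(term_score, type_score, lam=0.5):
    """Linear interpolation of term-based and type-based scores (assumed form).
    lam = 0 ignores types; lam = 1 ignores the term-based score."""
    return (1.0 - lam) * term_score + lam * type_score


def rank(candidates, target_types, lam=0.5):
    """Return entity ids sorted by combined score, best first.
    Each candidate is a dict with hypothetical keys: id, term_score, types."""
    scored = [
        (combined_score(c["term_score"],
                        type_similarity(target_types, c["types"]),
                        lam),
         c["id"])
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entity_id for _, entity_id in scored]


# "Oracle" setting: the target type distribution for the query is assumed known.
target_types = {"dbo:City": 0.7, "dbo:Country": 0.3}
candidates = [
    {"id": "A", "term_score": 0.6, "types": {"dbo:Person"}},
    {"id": "B", "term_score": 0.5, "types": {"dbo:City"}},
]
ranking_with_types = rank(candidates, target_types, lam=0.5)
ranking_terms_only = rank(candidates, target_types, lam=0.0)
```

In this toy setup, entity B has a lower term-based score than A but matches a likely target type, so a non-zero `lam` can move it ahead of A. How much weight types deserve is exactly the kind of question the oracle analysis is meant to inform.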
Linking Data Across Universities: An Integrated Video Lectures Dataset
This paper presents our work and experience interlinking educational information across universities through the use of Linked Data principles and technologies. More specifically, this paper focuses on selecting, extracting, structuring and interlinking information about video lectures produced by 27 different educational institutions. For this purpose, selected information from several websites and YouTube channels has been scraped and structured according to well-known vocabularies, such as FOAF or the W3C Ontology for Media Resources. To integrate this information, the extracted videos have been categorized under a common classification space, the taxonomy defined by the Open Directory Project. An evaluation of this categorization process has been conducted, obtaining a 98% degree of coverage and an 89% degree of correctness. As a result of this process, a new Linked Data dataset has been released containing more than 14,000 video lectures from 27 different institutions, categorized under a common classification scheme.
Mining Meaning from Wikipedia
Wikipedia is a goldmine of information; not just for its many readers, but
also for the growing community of researchers who recognize it as a resource of
exceptional scale and utility. It represents a vast investment of manual effort
and judgment: a huge, constantly evolving tapestry of concepts and relations
that is being applied to a host of tasks.
This article provides a comprehensive description of this work. It focuses on
research that extracts and makes use of the concepts, relations, facts and
descriptions found in Wikipedia, and organizes the work into four broad
categories: applying Wikipedia to natural language processing; using it to
facilitate information retrieval and information extraction; and as a resource
for ontology building. The article addresses how Wikipedia is being used as is,
how it is being improved and adapted, and how it is being combined with other
structures to create entirely new resources. We identify the research groups
and individuals involved, and how their work has developed in the last few
years. We provide a comprehensive list of the open-source software they have
produced.
Comment: An extensive survey of re-using information in Wikipedia in natural
language processing, information retrieval and extraction and ontology
building. Accepted for publication in International Journal of Human-Computer
Studie
Enabling Keyword Search on Linked Data Repositories: An Ontology-Based Approach
The Web is experiencing a continuous change that is leading to the realization of the Semantic Web. Initiatives such as Linked Data have made a huge amount of structured information publicly available, encouraging the rest of the Internet community to tag their resources with it. Unfortunately, the amount of interlinked domains and information is so big that handling it efficiently has become really difficult for final users. Thus, we have to provide them with tools to search for the needed resources in an easy way. In this paper, we propose an approach to provide users with different domain views on a general data repository, enabling them to perform both keyword and refinement searches. Our system exploits the knowledge stored in ontologies to 1) perform efficient keyword searches over a specified domain, and 2) refine the user's domain searches. In this way, we enable the definition of different semantic views on Linked Data datasets without having to change the original semantics. We present a prototype of our approach that focuses on the case of DBpedia, which provides a semantic way to access Wikipedia.
From logical forms to SPARQL query with GETARUNS
We present a system for Question Answering which computes a
prospective answer from Logical Forms produced by a full-fledged NLP system for
text understanding, and then maps the result onto schemata in SPARQL to be
used for accessing the Semantic Web. As an intermediate step, and whenever
there are complex concepts to be mapped, the system looks for a corresponding
amalgam in YAGO classes. It is just by the internal structure of the Logical
Form that we are able to produce a suitable and meaningful context for concept
disambiguation. Logical Forms are the final output of a complex system for text
understanding - GETARUNS - which can deal with different levels of syntactic
and semantic ambiguity in the generation of a final structure, by accessing
computational lexica equipped with sub-categorization frames and appropriate
selectional restrictions applied to the attachment of complements and adjuncts.
The system also produces pronominal binding and instantiates the implicit
arguments, if needed, in order to complete the required Predicate Argument
structure which is licensed by the semantic component.