Method of Extracting Is-A and Part-Of Relations Using Pattern Pairs in Mass Corpus
PACLIC 23 / City University of Hong Kong / 3-5 December 200
PARNT: A statistic based approach to extract non-taxonomic relationships of ontologies from text
Learning Non-Taxonomic Relationships is a subfield
of Ontology learning that aims at automating the
extraction of these relationships from text. This article
proposes PARNT, a novel approach that supports ontology
engineers in extracting these elements from corpora of plain
English. PARNT is parametrized, extensible and uses original
solutions that help to achieve better results when compared to
other techniques for extracting non-taxonomic relationships
from ontology concepts and English text. To evaluate
PARNT's effectiveness, a comparative experiment with another
state-of-the-art technique was conducted. This work is supported by CNPq and CAPES, research funding agencies of the Brazilian government.
Towards Terascale Knowledge Acquisition
Although vast amounts of textual data are freely available, many NLP algorithms exploit only a minute percentage of it. In this paper, we study the challenges of working at the terascale. We present an algorithm, designed for the terascale, for mining is-a relations that achieves similar performance to a state-of-the-art linguistically-rich method. We focus on the accuracy of these two systems as a function of processing time and corpus size.
Natural Language-based Approach for Helping in the Reuse of Ontology Design Patterns
Experiments in the reuse of Ontology Design Patterns (ODPs) have
revealed that users with different levels of expertise in ontology modelling face
difficulties when reusing ODPs. With the aim of tackling this problem we propose
a method and a tool for supporting a semi-automatic reuse of ODPs that
takes as input formulations in natural language (NL) of the domain aspect to be
modelled, and obtains as output a set of ODPs for solving the initial ontological
needs. The correspondence between ODPs and NL formulations is done
through Lexico-Syntactic Patterns, linguistic constructs that convey the semantic
relations present in ODPs, and which constitute the main contribution of this
paper. The main benefit of the proposed approach is the use of non-restricted
NL formulations in various languages for obtaining ODPs. The use of full NL
poses challenges in the disambiguation of linguistic expressions that we expect
to solve with user interaction, among other strategies.
Evaluating techniques for learning non-taxonomic relationships of ontologies from text
Learning Non-Taxonomic Relationships is a sub-field of Ontology Learning that aims
at automating the extraction of these relationships from text. Several techniques have been
proposed based on Natural Language Processing and Machine Learning. However, just like for
other techniques for Ontology Learning, evaluating techniques for Learning Non-Taxonomic
Relationships is an open problem. Three general proposals suggest that the learned ontologies
can be evaluated in an executable application or by domain experts or even by a comparison
with a predefined reference ontology. This article proposes two procedures to evaluate
techniques for Learning Non-Taxonomic Relationships based on the comparison of the
relationships obtained with those of a reference ontology. Also, these procedures are used in
the evaluation of two state-of-the-art techniques performing the extraction of relationships from
two corpora in the domains of biology and Family Law. This work is supported by CNPq, CAPES and FAPEMA, research funding agencies of the Brazilian government.
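The abstract above does not spell out its two evaluation procedures. As a rough illustration of the general idea of comparing extracted relationships with a reference ontology, the sketch below computes precision, recall and F1 over exact-match (concept, relation, concept) triples. The function name, the exact-match criterion and the sample triples are all assumptions made for illustration, not the article's method or data.

```python
def evaluate_relations(extracted, reference):
    """Compare extracted triples with reference-ontology triples.

    Assumption: a relationship counts as correct only if the whole
    (concept, relation, concept) triple appears in the reference.
    """
    extracted, reference = set(extracted), set(reference)
    correct = extracted & reference
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical triples, invented for this example only.
reference = {
    ("judge", "grants", "custody"),
    ("court", "hears", "case"),
    ("parent", "pays", "alimony"),
    ("spouse", "files", "petition"),
}
extracted = {
    ("judge", "grants", "custody"),
    ("court", "hears", "case"),
    ("parent", "owns", "house"),
}

precision, recall, f1 = evaluate_relations(extracted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is the strictest choice; a real evaluation might also credit synonymous relation labels, which this sketch does not attempt.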
Populating a Knowledge Base with Object-Location Relations Using Distributional Semantics
The paper presents an approach to extract knowledge from large text corpora, in particular knowledge that facilitates object manipulation by embodied intelligent systems that need to act in the world. As a first step, our goal is to extract the prototypical location of given objects from text corpora. We approach this task by calculating relatedness scores for objects and locations using techniques from distributional semantics. We empirically compare different methods for representing locations and objects as vectors in some geometric space, and we evaluate them with respect to a crowd-sourced gold standard in which human subjects had to rate the prototypicality of a location given an object. By applying the proposed framework on DBpedia, we are able to build a knowledge base of 931 high-confidence object-location relations in a fully automatic fashion.
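One common way to score relatedness between two vectors is cosine similarity. The abstract does not say which measure or which vector representations the authors settled on, so the sketch below is only an illustration of the general idea: it ranks candidate locations for an object by cosine similarity. The three-dimensional vectors and the names `cosine` and `rank_locations` are invented for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_locations(object_vec, location_vecs):
    """Return (location, score) pairs sorted from most to least related."""
    scored = [(name, cosine(object_vec, vec)) for name, vec in location_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration; real ones would come from a corpus.
knife = [0.9, 0.1, 0.0]
locations = {
    "kitchen": [0.8, 0.2, 0.1],
    "bathroom": [0.1, 0.9, 0.2],
    "garage": [0.2, 0.1, 0.9],
}

for name, score in rank_locations(knife, locations):
    print(f"{name}: {score:.3f}")
```

With these toy vectors the kitchen ranks first; keeping only pairs above a score threshold would be one way to select high-confidence relations, though the threshold itself would have to be tuned against a gold standard.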