Rule-based Chunking and Reusability
In this paper we discuss a rule-based approach to chunking implemented using the LT-XML2 and LT-TTT2 tools. We describe the tools and the pipeline and grammars that have been developed for the task of chunking. We show that our rule-based approach is easy to adapt to different chunking styles and that the mark-up of further linguistic information such as nominal and verbal heads can be added to the rules at little extra cost. We evaluate our chunker against the CoNLL 2000 data and discuss discrepancies between our output and the CoNLL mark-up as well as discrepancies within the CoNLL data itself. We contrast our results with the higher scores obtained using machine learning and argue that the portability and flexibility of our approach still make it a more practical solution.
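To make the idea of rule-based chunking with head mark-up concrete, here is a minimal sketch in Python. It is not the LT-XML2/LT-TTT2 pipeline described in the abstract, which uses its own grammars. The single rule (optional determiner, any adjectives, one or more nouns), the function name `chunk_noun_phrases`, and the choice of the last noun as the nominal head are all illustrative assumptions.

```python
import re

def chunk_noun_phrases(tagged):
    """Group (word, pos) pairs into noun chunks with one hand-written rule.

    Assumes Penn Treebank style tags. The rule and the head choice are
    illustrative assumptions, not the grammar used in the paper.
    """
    def code(pos):
        # Map each tag to one letter so a regex can serve as the chunk rule.
        if pos == "DT":
            return "D"
        if pos.startswith("JJ"):
            return "J"
        if pos.startswith("NN"):
            return "N"
        return "O"

    codes = "".join(code(pos) for _, pos in tagged)
    chunks = []
    # Rule: optional determiner, any number of adjectives, one or more nouns.
    for match in re.finditer(r"D?J*N+", codes):
        words = [word for word, _ in tagged[match.start():match.end()]]
        head = words[-1]  # assumption: the last noun is the nominal head
        chunks.append((" ".join(words), head))
    return chunks

sentence = [("The", "DT"), ("quick", "JJ"), ("fox", "NN"), ("jumped", "VBD"),
            ("over", "IN"), ("the", "DT"), ("lazy", "JJ"), ("dogs", "NNS")]
print(chunk_noun_phrases(sentence))
# [('The quick fox', 'fox'), ('the lazy dogs', 'dogs')]
```

Adapting such a chunker to a different chunking style amounts to editing the rule, which is the kind of flexibility the abstract argues for.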
EliXR-TIME: A Temporal Knowledge Representation for Clinical Research Eligibility Criteria.
Effective clinical text processing requires accurate extraction and representation of temporal expressions. Multiple temporal information extraction models have been developed, but a similar need for extracting temporal expressions in eligibility criteria (e.g., for eligibility determination) remains. We identified the temporal knowledge representation requirements of eligibility criteria by reviewing 100 temporal criteria. We developed EliXR-TIME, a frame-based representation designed to support semantic annotation for temporal expressions in eligibility criteria by reusing applicable classes from well-known clinical temporal knowledge representations. We used EliXR-TIME to analyze a training set of 50 new temporal eligibility criteria. We evaluated EliXR-TIME using an additional random sample of 20 eligibility criteria with temporal expressions that have no overlap with the training data, yielding 92.7% (76 / 82) inter-coder agreement on sentence chunking and 72% (72 / 100) agreement on semantic annotation. We conclude that this knowledge representation can facilitate semantic annotation of the temporal expressions in eligibility criteria.
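The agreement figures quoted above are consistent with simple percent agreement (matching items divided by total items). The short Python check below reproduces them. The abstract does not say whether a chance-corrected statistic was also computed, so the helper `percent_agreement` is only an illustrative assumption about how the percentages were derived.

```python
def percent_agreement(agreed, total):
    """Simple percent agreement, rounded to one decimal place."""
    return round(100.0 * agreed / total, 1)

# Counts taken from the abstract.
print(percent_agreement(76, 82))   # sentence chunking: 92.7
print(percent_agreement(72, 100))  # semantic annotation: 72.0
```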
Sciunits: Reusable Research Objects
Science is conducted collaboratively, often requiring knowledge sharing about
computational experiments. When experiments include only datasets, they can be
shared using Uniform Resource Identifiers (URIs) or Digital Object Identifiers
(DOIs). An experiment, however, seldom includes only datasets, but more often
includes software, its past execution, provenance, and associated
documentation. The Research Object has recently emerged as a comprehensive and
systematic method for aggregation and identification of diverse elements of
computational experiments. While a necessary method, mere aggregation is not
sufficient for the sharing of computational experiments. Other users must be
able to easily recompute on these shared research objects. In this paper, we
present the sciunit, a reusable research object in which aggregated content is
recomputable. We describe a Git-like client that efficiently creates, stores,
and repeats sciunits. We show through analysis that sciunits repeat
computational experiments with minimal storage and processing overhead.
Finally, we provide an overview of sharing and reproducible cyberinfrastructure
based on sciunits gaining adoption in the domain of geosciences.
PARNT: A statistic based approach to extract non-taxonomic relationships of ontologies from text
Learning Non-Taxonomic Relationships is a subfield
of Ontology learning that aims at automating the
extraction of these relationships from text. This article
proposes PARNT, a novel approach that supports ontology
engineers in extracting these elements from corpora of plain
English. PARNT is parametrized, extensible and uses original
solutions that help to achieve better results when compared to
other techniques for extracting non-taxonomic relationships
from ontology concepts and English text. To evaluate
PARNT's effectiveness, a comparative experiment with another
state-of-the-art technique was conducted. This work is supported by CNPq and CAPES, research funding agencies of the Brazilian government.
The problem of learning non-taxonomic relationships of ontologies from text
Manual construction of ontologies by domain experts and knowledge engineers is a costly task. Thus, automatic and/or semi-automatic approaches to their development are needed. Ontology Learning aims at identifying the constituent elements of an ontology, such as non-taxonomic relationships, from textual information sources. This article presents a discussion of the problem of Learning Non-Taxonomic Relationships of Ontologies and defines its generic process. Four techniques representing the state of the art of Learning Non-Taxonomic Relationships of Ontologies are described and the solutions they provide are discussed along with their advantages and limitations.
The skill-based approach: developing and applying a modelling method based on skill reuse
Skill reuse is a commonly accepted aspect of human cognition. However, it is hardly ever applied in the construction of cognitive models. Models built without taking skill reuse into account risk being too specific and therefore adding little to our knowledge of human cognition. In this dissertation we develop a modelling approach centred on skill reuse to address this problem. Models created with this method should add more to our general understanding of human cognition. The dissertation discusses the modelling approach, the steps taken in creating it, and its application to two experimental paradigms.
Reviewing the problem of learning non-taxonomic relationships of ontologies from text
Learning Non-Taxonomic Relationships is a sub-field of Ontology
Learning that aims at automating the extraction of these relationships from text.
This article discusses the problem of Learning Non-Taxonomic Relationships
of ontologies and proposes a generic process for approaching it. Some
techniques representing the state of the art of this field are discussed along with
their advantages and limitations. Finally, a framework for Learning Non-Taxonomic
Relationships being developed by the authors is briefly discussed.
This framework intends to be a customizable solution to reach good
effectiveness in the process of extraction of non-taxonomic relationships
according to the characteristics of the corpus. This work is supported by CNPq, CAPES and FAPEMA, research funding agencies of the Brazilian government.