FrameNet CNL: a Knowledge Representation and Information Extraction Language
The paper presents a FrameNet-based information extraction and knowledge
representation framework, called FrameNet-CNL. The framework is used on natural
language documents and represents the extracted knowledge in a tailor-made
Frame-ontology from which unambiguous FrameNet-CNL paraphrase text can be
generated automatically in multiple languages. This approach brings together
the fields of information extraction and CNL, because a source text can be
considered as belonging to FrameNet-CNL if the information extraction parser
produces the correct knowledge representation as a result. We describe a
state-of-the-art information extraction parser used by a national news agency
and speculate that FrameNet-CNL eventually could shape the natural language
subset used for writing the newswire articles.
Comment: CNL-2014 camera-ready version. The final publication is available at
link.springer.co
EliXR-TIME: A Temporal Knowledge Representation for Clinical Research Eligibility Criteria.
Effective clinical text processing requires accurate extraction and representation of temporal expressions. Multiple temporal information extraction models have been developed, but a similar need for extracting temporal expressions in eligibility criteria (e.g., for eligibility determination) remains. We identified the temporal knowledge representation requirements of eligibility criteria by reviewing 100 temporal criteria. We developed EliXR-TIME, a frame-based representation designed to support semantic annotation of temporal expressions in eligibility criteria by reusing applicable classes from well-known clinical temporal knowledge representations. We used EliXR-TIME to analyze a training set of 50 new temporal eligibility criteria. We evaluated EliXR-TIME on an additional random sample of 20 eligibility criteria with temporal expressions that do not overlap with the training data, yielding 92.7% (76 / 82) inter-coder agreement on sentence chunking and 72% (72 / 100) agreement on semantic annotation. We conclude that this knowledge representation can facilitate semantic annotation of the temporal expressions in eligibility criteria.
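The agreement figures quoted in this abstract can be rechecked with a few lines of arithmetic. The sketch below assumes simple percent agreement (agreed items divided by total items), which is what the quoted fractions suggest; the function name `percent_agreement` is a hypothetical helper, not something taken from the paper.

```python
def percent_agreement(agreed: int, total: int) -> float:
    """Return simple percent agreement: agreed items over total items."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * agreed / total


# Counts as reported in the abstract.
chunking = percent_agreement(76, 82)     # sentence chunking
annotation = percent_agreement(72, 100)  # semantic annotation

print(f"sentence chunking:   {chunking:.1f}%")
print(f"semantic annotation: {annotation:.1f}%")
```

Percent agreement does not correct for chance; the abstract does not say whether a chance-corrected statistic was also computed.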
Beyond information extraction: The role of ontology in military report processing
Information extraction tools like SMES transform natural language into a formal representation, e.g. into feature structures. In doing so, these tools exploit and apply linguistic knowledge about the syntactic and morphological regularities of the language used. However, these tools apply semantic and pragmatic knowledge only partially at best. Automatic processing of military reports has to result in a visualization of the report's content on a map as well as in an update of the underlying database, in order to allow the common operational picture to be updated. Normally, however, the information provided by the result of the information extraction is not explicit enough for visualization processes and database insertions. This originates from the reports themselves, which are elliptical, ambiguous, and vague. In order to overcome this obstacle, the situational context, and thus semantic and pragmatic aspects, have to be taken into account.
In the paper at hand, we present a system that uses an ontological module to integrate semantic and pragmatic knowledge. The result of the completion contains all the specifications needed to allow for a visualization of the report's content on a map as well as for a database update.
Ontology-guided extraction of structured information from unstructured text: Identifying and capturing complex relationships
Many applications call for methods to enable automatic extraction of structured information from unstructured natural language text. Due to the inherent challenges of natural language processing, most of the existing methods for information extraction from text tend to be domain specific. This thesis explores a modular ontology-based approach to information extraction that decouples domain-specific knowledge from the rules used for information extraction. Specifically, the thesis describes:
1. A framework for ontology-driven extraction of a subset of nested complex relationships (e.g., Joe reports that Jim is a reliable employee) from free text. The extracted relationships are semantically represented in the form of RDF (resource description framework) graphs, which can be stored in RDF knowledge bases and queried using query languages for RDF.
2. An open source implementation of SEMANTIXS, a system for ontology-guided extraction and semantic representation of structured information from unstructured text.
3. Results of experiments that offer evidence of the utility of the proposed ontology-based approach for extracting complex relationships from text.
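As an illustration of item 1, a nested relationship such as "Joe reports that Jim is a reliable employee" can be written as RDF triples by reifying the inner statement with the standard RDF reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`). This is only a minimal sketch of one possible encoding, not the representation SEMANTIXS actually uses; the `http://example.org/` namespace and the names `reports`, `isA`, `ReliableEmployee`, and `stmt1` are assumptions made up for the example.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def reify(statement_id: str, triple: Triple) -> List[Triple]:
    """Describe a triple as an rdf:Statement resource so it can be referenced."""
    s, p, o = triple
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", s),
        (statement_id, RDF + "predicate", p),
        (statement_id, RDF + "object", o),
    ]


def nested_report(reporter: str, statement_id: str, inner: Triple) -> List[Triple]:
    """Encode 'reporter reports that <inner>' via the reified inner statement."""
    return reify(statement_id, inner) + [(reporter, EX + "reports", statement_id)]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize IRI-only triples in N-Triples syntax, one triple per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# "Joe reports that Jim is a reliable employee"
inner = (EX + "Jim", EX + "isA", EX + "ReliableEmployee")
graph = nested_report(EX + "Joe", EX + "stmt1", inner)

print(to_ntriples(graph))
```

The resulting triples could be loaded into an RDF store and queried with an RDF query language; reification is only one of several ways to model statements about statements.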
User driven information extraction with LODIE
Information Extraction (IE) is the technique for transforming unstructured
or semi-structured data into a structured representation that can be
understood by machines. In this paper we use a user-driven Information
Extraction technique to wrap entity-centric Web pages. The user can select
concepts and properties of interest from available Linked Data. Given a
number of websites containing pages about the concepts of interest, the
method will exploit (i) recurrent structures in the Web pages and
(ii) available knowledge in Linked Data to extract the information of
interest from the Web pages.