426,934 research outputs found

FrameNet CNL: a Knowledge Representation and Information Extraction Language

Author: Barzdins Guntis
Publication date: 01/01/2014

The paper presents a FrameNet-based information extraction and knowledge representation framework called FrameNet-CNL. The framework is applied to natural language documents and represents the extracted knowledge in a tailor-made Frame-ontology, from which unambiguous FrameNet-CNL paraphrase text can be generated automatically in multiple languages. This approach brings together the fields of information extraction and CNL, because a source text can be considered to belong to FrameNet-CNL if the information extraction parser produces the correct knowledge representation as a result. We describe a state-of-the-art information extraction parser used by a national news agency and speculate that FrameNet-CNL could eventually shape the natural language subset used for writing newswire articles. Comment: CNL-2014 camera-ready version. The final publication is available at link.springer.co

arXiv.org e-Print Archive

Crossref

E-resource repository of the University of Latvia

EliXR-TIME: A Temporal Knowledge Representation for Clinical Research Eligibility Criteria.

Author: Boland Mary Regina
Carini Simona
Sim Ida
Tu Samson W
Weng Chunhua
Publication venue: eScholarship, University of California
Publication date: 01/01/2012

Effective clinical text processing requires accurate extraction and representation of temporal expressions. Multiple temporal information extraction models have been developed, but a similar need for extracting temporal expressions in eligibility criteria (e.g., for eligibility determination) remains. We identified the temporal knowledge representation requirements of eligibility criteria by reviewing 100 temporal criteria. We developed EliXR-TIME, a frame-based representation designed to support semantic annotation for temporal expressions in eligibility criteria by reusing applicable classes from well-known clinical temporal knowledge representations. We used EliXR-TIME to analyze a training set of 50 new temporal eligibility criteria. We evaluated EliXR-TIME using an additional random sample of 20 eligibility criteria with temporal expressions that have no overlap with the training data, yielding 92.7% (76 / 82) inter-coder agreement on sentence chunking and 72% (72 / 100) agreement on semantic annotation. We conclude that this knowledge representation can facilitate semantic annotation of the temporal expressions in eligibility criteria.

PubMed Central

eScholarship - University of California
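The agreement figures quoted in the EliXR-TIME abstract above can be reproduced from the raw counts it gives. The following is a minimal sketch that assumes the reported values are simple percent agreement (agreed items divided by total items), which is what the counts in parentheses suggest; it is not a chance-corrected measure such as Cohen's kappa. The function name `percent_agreement` is hypothetical and does not come from the paper.

```python
def percent_agreement(agreed: int, total: int) -> float:
    """Simple percent agreement: share of items on which the coders agree."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * agreed / total


# Counts as quoted in the abstract.
chunking = percent_agreement(76, 82)     # sentence chunking
annotation = percent_agreement(72, 100)  # semantic annotation

print(f"chunking: {chunking:.1f}%")      # prints "chunking: 92.7%"
print(f"annotation: {annotation:.0f}%")  # prints "annotation: 72%"
```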

Beyond information extraction: The role of ontology in military report processing

Author: Frey Miloslaw
Schade Dr. Ulrich
Publication venue: E. Buchberger
Publication date: 01/01/2004

Information extraction tools like SMES transform natural language into a formal representation, e.g. into feature structures. In doing so, these tools exploit and apply linguistic knowledge about the syntactic and morphological regularities of the language used. However, these tools apply semantic and pragmatic knowledge only partially at best. Automatic processing of military reports has to result in a visualization of the report’s content on a map as well as in an update of the underlying database, so that the common operational picture can be updated. Normally, however, the information provided by the result of the information extraction is not explicit enough for visualization processes and database insertions. This stems from the reports themselves, which are elliptical, ambiguous, and vague. To overcome this obstacle, the situational context, and thus semantic and pragmatic aspects, have to be taken into account. In this paper, we present a system that uses an ontological module to integrate semantic and pragmatic knowledge. The result of the completion contains all the specifications needed to visualize the report’s content on a map and to update the database.

CogPrints Cognitive Sciences Eprint Archive

Ontology-guided extraction of structured information from unstructured text: Identifying and capturing complex relationships

Author: Pandit Sushain
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2010

Many applications call for methods that enable automatic extraction of structured information from unstructured natural language text. Due to the inherent challenges of natural language processing, most existing methods for information extraction from text tend to be domain specific. This thesis explores a modular ontology-based approach to information extraction that decouples domain-specific knowledge from the rules used for information extraction. Specifically, the thesis describes: 1. A framework for ontology-driven extraction of a subset of nested complex relationships (e.g., Joe reports that Jim is a reliable employee) from free text. The extracted relationships are semantically represented in the form of RDF (Resource Description Framework) graphs, which can be stored in RDF knowledge bases and queried using query languages for RDF. 2. An open source implementation of SEMANTIXS, a system for ontology-guided extraction and semantic representation of structured information from unstructured text. 3. Results of experiments that offer evidence of the utility of the proposed ontology-based approach to extract complex relationships from text.

Digital Repository @ Iowa State University (ISU)
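To make the nested relationship in the abstract above concrete ("Joe reports that Jim is a reliable employee"), here is a minimal sketch of one way such a statement could be encoded as RDF-style triples. It assumes standard RDF reification (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`); the abstract does not say which encoding SEMANTIXS uses, so this is an illustration and not the thesis's method. Plain Python tuples stand in for triples, and every `ex:` identifier and helper name is hypothetical.

```python
# Illustrative only: tuples stand in for RDF triples; no RDF library is used.
# The ex:* names and both helper functions are hypothetical.

RDF_TYPE = "rdf:type"


def reify(statement_id, subject, predicate, obj):
    """Return the four triples that describe one statement via RDF reification."""
    return [
        (statement_id, RDF_TYPE, "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


def objects(graph, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


# Inner claim: "Jim is a reliable employee", described as a statement resource.
graph = reify("ex:stmt1", "ex:Jim", RDF_TYPE, "ex:ReliableEmployee")

# Outer claim: "Joe reports <inner claim>", pointing at the statement resource.
graph.append(("ex:Joe", "ex:reports", "ex:stmt1"))

# Simple lookup: what does Joe report, and who is that statement about?
reported = objects(graph, "ex:Joe", "ex:reports")
about = objects(graph, reported[0], "rdf:subject")
```

Reifying the inner claim gives it an identifier of its own, so the outer relationship can refer to it without asserting that Jim actually is a reliable employee.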

User driven information extraction with LODIE

Author: Gentile Anna Lisa
Mazumdar Suvodeep
Publication venue: CEUR Workshop Proceedings
Publication date: 01/01/2014

Information Extraction (IE) is a technique for transforming unstructured or semi-structured data into a structured representation that can be understood by machines. In this paper we use a user-driven Information Extraction technique to wrap entity-centric Web pages. The user can select concepts and properties of interest from available Linked Data. Given a number of websites containing pages about the concepts of interest, the method will exploit (i) recurrent structures in the Web pages and (ii) available knowledge in Linked Data to extract the information of interest from the Web pages.

CiteSeerX

Sheffield Hallam University Research Archive

MAnnheim DOCument Server