    FrameNet CNL: a Knowledge Representation and Information Extraction Language

    The paper presents FrameNet-CNL, a FrameNet-based information extraction and knowledge representation framework. The framework is applied to natural language documents and represents the extracted knowledge in a tailor-made Frame-ontology, from which unambiguous FrameNet-CNL paraphrase text can be generated automatically in multiple languages. This approach brings together the fields of information extraction and CNL, because a source text can be considered to belong to FrameNet-CNL if the information extraction parser produces the correct knowledge representation as a result. We describe a state-of-the-art information extraction parser used by a national news agency and speculate that FrameNet-CNL could eventually shape the natural language subset used for writing newswire articles. Comment: CNL-2014 camera-ready version. The final publication is available at link.springer.co

    EliXR-TIME: A Temporal Knowledge Representation for Clinical Research Eligibility Criteria.

    Effective clinical text processing requires accurate extraction and representation of temporal expressions. Multiple temporal information extraction models have been developed, but a similar need for extracting temporal expressions in eligibility criteria (e.g., for eligibility determination) remains. We identified the temporal knowledge representation requirements of eligibility criteria by reviewing 100 temporal criteria. We developed EliXR-TIME, a frame-based representation designed to support semantic annotation of temporal expressions in eligibility criteria by reusing applicable classes from well-known clinical temporal knowledge representations. We used EliXR-TIME to analyze a training set of 50 new temporal eligibility criteria. We evaluated EliXR-TIME on an additional random sample of 20 eligibility criteria with temporal expressions that had no overlap with the training data, yielding 92.7% (76 / 82) inter-coder agreement on sentence chunking and 72% (72 / 100) agreement on semantic annotation. We conclude that this knowledge representation can facilitate semantic annotation of the temporal expressions in eligibility criteria.
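The agreement figures quoted in this abstract can be reproduced from the counts it gives. The sketch below assumes they are raw percent agreement (agreed items divided by total items), which is what the reported counts suggest; the abstract does not say whether a chance-corrected measure was also computed. The function name `percent_agreement` is a hypothetical helper, not something from the paper.

```python
def percent_agreement(agreed: int, total: int) -> float:
    """Raw percent agreement: share of items on which the coders agreed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * agreed / total


# Counts as reported in the abstract.
chunking = percent_agreement(76, 82)     # sentence chunking
annotation = percent_agreement(72, 100)  # semantic annotation

print(f"chunking:   {chunking:.1f}%")    # 92.7%
print(f"annotation: {annotation:.1f}%")  # 72.0%
```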

    Beyond information extraction: The role of ontology in military report processing

    Information extraction tools like SMES transform natural language into a formal representation, e.g. into feature structures. In doing so, these tools exploit linguistic knowledge about the syntactic and morphological regularities of the language used. However, they apply semantic and pragmatic knowledge only partially at best. Automatic processing of military reports has to result in a visualization of the report's content on a map as well as in an update of the underlying database, so that the common operational picture can be kept up to date. Normally, however, the result of the information extraction is not explicit enough for visualization and database insertion. This is because the reports themselves are elliptical, ambiguous, and vague. To overcome this obstacle, the situational context, and thus semantic and pragmatic aspects, has to be taken into account. In this paper, we present a system that uses an ontological module to integrate semantic and pragmatic knowledge. The result of the completion contains all the specifications needed to visualize the report's content on a map and to update the database.

    Ontology-guided extraction of structured information from unstructured text: Identifying and capturing complex relationships

    Many applications call for methods to enable automatic extraction of structured information from unstructured natural language text. Due to the inherent challenges of natural language processing, most of the existing methods for information extraction from text tend to be domain specific. This thesis explores a modular ontology-based approach to information extraction that decouples domain-specific knowledge from the rules used for information extraction. Specifically, the thesis describes: (1) a framework for ontology-driven extraction of a subset of nested complex relationships (e.g., Joe reports that Jim is a reliable employee) from free text, in which the extracted relationships are semantically represented in the form of RDF (Resource Description Framework) graphs that can be stored in RDF knowledge bases and queried using query languages for RDF; (2) an open source implementation of SEMANTIXS, a system for ontology-guided extraction and semantic representation of structured information from unstructured text; and (3) results of experiments that offer evidence of the utility of the proposed ontology-based approach for extracting complex relationships from text.
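To make the nested-relationship example concrete, here is a minimal sketch of how "Joe reports that Jim is a reliable employee" might be written as RDF-style triples. It uses the standard RDF reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`) and plain Python tuples rather than an RDF library. The abstract does not say how SEMANTIXS encodes nesting, so reification is an assumption here, and every `ex:` name and helper function is hypothetical.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def reify(statement_id: str, inner: Triple) -> List[Triple]:
    """Describe a triple as a resource so other triples can refer to it."""
    return [
        Triple(statement_id, "rdf:type", "rdf:Statement"),
        Triple(statement_id, "rdf:subject", inner.subject),
        Triple(statement_id, "rdf:predicate", inner.predicate),
        Triple(statement_id, "rdf:object", inner.obj),
    ]


def objects(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object matching a (subject, predicate) pattern."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


# Inner claim: "Jim is a reliable employee" (ex: names are illustrative).
inner = Triple("ex:Jim", "ex:isA", "ex:ReliableEmployee")

# Outer claim: "Joe reports <inner claim>", pointing at the reified statement.
graph = reify("ex:stmt1", inner) + [Triple("ex:Joe", "ex:reports", "ex:stmt1")]

# Two-step lookup: what does Joe report, and who is that statement about?
for stmt in objects(graph, "ex:Joe", "ex:reports"):
    print(stmt, "is about", objects(graph, stmt, "rdf:subject"))
```

The inner claim is only described, not asserted, so the graph records that Joe reported it without committing to its truth.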

    User driven information extraction with LODIE

    Information Extraction (IE) is the technique of transforming unstructured or semi-structured data into a structured representation that can be understood by machines. In this paper we use a user-driven Information Extraction technique to wrap entity-centric Web pages. The user can select concepts and properties of interest from available Linked Data. Given a number of websites containing pages about the concepts of interest, the method exploits (i) recurrent structures in the Web pages and (ii) available knowledge in Linked Data to extract the information of interest from the Web pages.