
    Development of an ontology for aerospace engine components degradation in service

    This paper presents the development of an ontology for component service degradation. In this paper, degradation mechanisms in gas turbine metallic components are used for a case study to explain how a taxonomy within an ontology can be validated. The validation method used in this paper uses an iterative process and sanity checks. Data extracted from on-demand textual information are filtered and grouped into classes of degradation mechanisms. Various concepts are systematically and hierarchically arranged for use in the service maintenance ontology. The allocation of the mechanisms to the AS-IS ontology presents a robust data collection hub. Data integrity is guaranteed when the TO-BE ontology is introduced to analyse processes relative to various failure events. The initial evaluation reveals improvement in the performance of the TO-BE domain ontology based on iterations and updates with recognised mechanisms. The information extracted and collected is required to improve service knowledge and performance feedback which are important for service engineers. Existing research areas such as natural language processing, knowledge management, and information extraction were also examined

    FrameNet CNL: a Knowledge Representation and Information Extraction Language

    The paper presents a FrameNet-based information extraction and knowledge representation framework, called FrameNet-CNL. The framework is used on natural language documents and represents the extracted knowledge in a tailor-made Frame-ontology from which unambiguous FrameNet-CNL paraphrase text can be generated automatically in multiple languages. This approach brings together the fields of information extraction and CNL, because a source text can be considered as belonging to FrameNet-CNL if an information extraction parser produces the correct knowledge representation as a result. We describe a state-of-the-art information extraction parser used by a national news agency and speculate that FrameNet-CNL eventually could shape the natural language subset used for writing the newswire articles. Comment: CNL-2014 camera-ready version. The final publication is available at link.springer.co

    Ontologies and Information Extraction

    This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of texts are interpreted with respect to a predefined partial domain model. This report shows that depending on the nature and the depth of the interpretation to be done for extracting the information, more or less knowledge must be involved. This report is mainly illustrated in biology, a domain in which there are critical needs for content-based exploration of the scientific literature and which becomes a major application domain for IE

    A framework for interrogating social media images to reveal an emergent archive of war

    The visual image has long been central to how war is seen, contested and legitimised, remembered and forgotten. Archives are pivotal to these ends as is their ownership and access, from state and other official repositories through to the countless photographs scattered and hidden from a collective understanding of what war looks like in individual collections and dusty attics. With the advent and rapid development of social media, however, the amateur and the professional, the illicit and the sanctioned, the personal and the official, and the past and the present, all seem to inhabit the same connected and chaotic space. However, to even begin to render intelligible the complexity, scale and volume of what war looks like in social media archives is a considerable task, given the limitations of any traditional human-based method of collection and analysis. We thus propose the production of a series of ‘snapshots’, using computer-aided extraction and identification techniques to try to offer an experimental way in to conceiving a new imaginary of war. We were particularly interested in testing to see if twentieth century wars, obviously initially captured via pre-digital means, had become more ‘settled’ over time in terms of their remediated presence today through their visual representations and connections on social media, compared with wars fought in digital media ecologies (i.e. those fought and initially represented amidst the volume and pervasiveness of social media images). To this end, we developed a framework for automatically extracting and analysing war images that appear in social media, using both the features of the images themselves, and the text and metadata associated with each image. The framework utilises a workflow comprising four core stages: (1) information retrieval, (2) data pre-processing, (3) feature extraction, and (4) machine learning. Our corpus was drawn from the social media platforms Facebook and Flickr

    Design and enhanced evaluation of a robust anaphor resolution algorithm

    Syntactic coindexing restrictions are by now known to be of central importance to practical anaphor resolution approaches. Since, in particular due to structural ambiguity, the assumption of the availability of a unique syntactic reading proves to be unrealistic, robust anaphor resolution relies on techniques to overcome this deficiency. This paper describes the ROSANA approach, which generalizes the verification of coindexing restrictions in order to make it applicable to the deficient syntactic descriptions that are provided by a robust state-of-the-art parser. By a formal evaluation on two corpora that differ with respect to text genre and domain, it is shown that ROSANA achieves high-quality robust coreference resolution. Moreover, by an in-depth analysis, it is proven that the robust implementation of syntactic disjoint reference is nearly optimal. The study reveals that, compared with approaches that rely on shallow preprocessing, the largely nonheuristic disjoint reference algorithmization opens up the possibility for a slight improvement. Furthermore, it is shown that more significant gains are to be expected elsewhere, particularly from a text-genre-specific choice of preference strategies. The performance study of the ROSANA system crucially rests on an enhanced evaluation methodology for coreference resolution systems, the development of which constitutes the second major contribution of the paper. As a supplement to the model-theoretic scoring scheme that was developed for the Message Understanding Conference (MUC) evaluations, additional evaluation measures are defined that, on one hand, support the developer of anaphor resolution systems, and, on the other hand, shed light on application aspects of pronoun interpretation