45,407 research outputs found

    Building a Generation Knowledge Source using Internet-Accessible Newswire

    In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE which uses a client-server architecture to extract noun-phrase descriptions of entities such as people, places, and organizations. The system serves two purposes: as an information extraction tool, it allows users to search for textual descriptions of entities; as a utility to generate functional descriptions (FD), it is used in a functional-unification based generation system. We present an evaluation of the approach and its applications to natural language generation and summarization. Comment: 8 pages, uses eps

    Ontology population for open-source intelligence: A GATE-based solution

    Open-Source INTelligence is intelligence based on publicly available sources such as news sites, blogs, and forums. The Web is the primary source of information, but once data are crawled, they need to be interpreted and structured. Ontologies may play a crucial role in this process, but because of the vast number of documents available, automatic mechanisms are needed to populate them from the crawled text. This paper presents an approach for the automatic population of predefined ontologies with data extracted from text and discusses the design and realization of a pipeline based on the General Architecture for Text Engineering system, which is of interest to both researchers and practitioners in the field. Experimental results, encouraging in terms of correctly extracted ontology instances, are also reported. Furthermore, the paper describes an alternative approach and provides additional experiments for one of the phases of our pipeline, which requires the use of predefined dictionaries for relevant entities. This variant reduced the manual workload required in that phase while still obtaining promising results.

    Generating indicative-informative summaries with SumUM

    We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative-informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step for exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies.

    Learning Correlations between Linguistic Indicators and Semantic Constraints: Reuse of Context-Dependent Descriptions of Entities

    This paper presents the results of a study on the semantic constraints imposed on lexical choice by certain contextual indicators. We show how such indicators are computed and how correlations between them and the choice of a noun phrase description of a named entity can be automatically established using supervised learning. Based on this correlation, we have developed a technique for automatic lexical choice of descriptions of entities in text generation. We discuss the underlying relationship between the pragmatics of choosing an appropriate description that serves a specific purpose in the automatically generated text and the semantics of the description itself. We present our work in the framework of the more general concept of reuse of linguistic structures that are automatically extracted from large corpora. We present a formal evaluation of our approach and we conclude with some thoughts on potential applications of our method. Comment: 7 pages, uses colacl.sty and acl.bst, uses epsfig. To appear in the Proceedings of the Joint 17th International Conference on Computational Linguistics 36th Annual Meeting of the Association for Computational Linguistics (COLING-ACL'98)

    A Biologically Informed Hylomorphism

    Although contemporary metaphysics has recently undergone a neo-Aristotelian revival wherein dispositions, or capacities, are now commonplace in empirically grounded ontologies, being routinely utilised in theories of causality and modality, a central Aristotelian concept has yet to be given serious attention: the doctrine of hylomorphism. The reason for this is clear: while the Aristotelian ontological distinction between actuality and potentiality has proven to be a fruitful conceptual framework with which to model the operation of the natural world, the distinction between form and matter has yet to similarly earn its keep. In this chapter, I offer a first step toward showing that the hylomorphic framework is up to that task. To do so, I return to the birthplace of that doctrine, the biological realm. Utilising recent advances in developmental biology, I argue that the hylomorphic framework is an empirically adequate and conceptually rich explanatory schema with which to model the nature of organisms.

    Coping with lists in the ifcOWL ontology

    Over the past few years, several suggestions have been made for how to convert an EXPRESS schema into an OWL ontology. The conversion from EXPRESS to OWL is of particular use to the architectural design and construction industry, because one of the key data models in that industry, namely the Industry Foundation Classes (IFC), is represented using the EXPRESS information modelling language. In each of these conversion options, the way in which lists are converted (e.g. lists of coordinates, lists of spaces in a floor) is key to the structure and eventual strength of the resulting ontology. In this article, we outline and discuss the main decisions that can be made in converting LIST concepts in EXPRESS to equivalent OWL expressions. This allows one to identify which conversion option is appropriate to support proper and efficient information reuse in the domain of architecture and construction.

    Spanish named entity recognition in the biomedical domain

    Named Entity Recognition in the clinical domain and in languages other than English is hampered by the absence of complete dictionaries, the informality of texts, the polysemy of terms, disagreement over the boundaries of an entity, and the scarcity of corpora and other resources. We present a Named Entity Recognition method for poorly resourced languages. The method was tested with Spanish radiology reports and compared with a conditional random fields system. Peer reviewed. Postprint (author's final draft)