
    Ontology Learning and Semantic Annotation: a Necessary Symbiosis

    Semantic annotation of text requires the dynamic merging of linguistically structured information and a "world model", usually represented as a domain-specific ontology. On the other hand, the process of engineering a domain ontology through a semi-automatic ontology learning system requires the availability of a considerable amount of semantically annotated documents. Facing this bootstrapping paradox requires an incremental process of annotation-acquisition-annotation, whereby domain-specific knowledge is acquired from linguistically annotated texts and then projected back onto texts for extra linguistic information to be annotated and further knowledge layers to be extracted. The presented methodology is a first step in the direction of a full "virtuous" circle where the semantic annotation platform and the evolving ontology interact in symbiosis. As a case study we have chosen the semantic annotation of product catalogues. We propose a hybrid approach, combining pattern matching techniques, which exploit the regular structure of product descriptions in catalogues, with Natural Language Processing techniques, which are used to analyze natural language descriptions. The semantic annotation involves access to the ontology, which is semi-automatically bootstrapped with an ontology learning tool from annotated collections of catalogues.

    Acquiring Word-Meaning Mappings for Natural Language Interfaces

    This paper focuses on a system, WOLFIE (WOrd Learning From Interpreted Examples), that acquires a semantic lexicon from a corpus of sentences paired with semantic representations. The lexicon learned consists of phrases paired with meaning representations. WOLFIE is part of an integrated system that learns to transform sentences into representations such as logical database queries. Experimental results are presented demonstrating WOLFIE's ability to learn useful lexicons for a database interface in four different natural languages. The usefulness of the lexicons learned by WOLFIE is compared to that of lexicons acquired by a similar system, with results favorable to WOLFIE. A second set of experiments demonstrates WOLFIE's ability to scale to larger and more difficult, albeit artificially generated, corpora. In natural language acquisition, it is difficult to gather the annotated data needed for supervised learning; however, unannotated data is fairly plentiful. Active learning methods attempt to select for annotation and training only the most informative examples, and therefore are potentially very useful in natural language applications. However, most results to date for active learning have only considered standard classification tasks. To reduce annotation effort while maintaining accuracy, we apply active learning to semantic lexicons. We show that active learning can significantly reduce the number of annotated examples required to achieve a given level of performance.

    MBT: A Memory-Based Part of Speech Tagger-Generator

    We introduce a memory-based approach to part of speech tagging. Memory-based learning is a form of supervised learning based on similarity-based reasoning. The part of speech tag of a word in a particular context is extrapolated from the most similar cases held in memory. Supervised learning approaches are useful when a tagged corpus is available as an example of the desired output of the tagger. Based on such a corpus, the tagger-generator automatically builds a tagger which is able to tag new text the same way, considerably reducing the development time needed to construct a tagger. Memory-based tagging shares this advantage with other statistical or machine learning approaches. Additional advantages specific to a memory-based approach include (i) the relatively small tagged corpus size sufficient for training, (ii) incremental learning, (iii) explanation capabilities, (iv) flexible integration of information in case representations, (v) its non-parametric nature, (vi) reasonably good results on unknown words without morphological analysis, and (vii) fast learning and tagging. In this paper we show that a large-scale application of the memory-based approach is feasible: we obtain a tagging accuracy that is on a par with that of known statistical approaches, and with attractive space and time complexity properties when using IGTree, a tree-based formalism for indexing and searching huge case bases. The use of IGTree has the additional advantage that the optimal context size for disambiguation is dynamically computed. Comment: 14 pages, 2 Postscript figures
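
The idea of extrapolating a tag from the most similar stored cases can be illustrated with a minimal sketch. This is not the MBT implementation: the feature set (previous tag, focus word, next word), the plain overlap metric, the linear search over memory, and the function names `extract_cases`, `overlap` and `tag_sentence` are all assumptions made for illustration, and the IGTree index described in the abstract is not reproduced here.

```python
from collections import Counter


def extract_cases(tagged_sentence):
    """Turn one tagged sentence into (features, tag) cases.

    Assumed, simplified context: previous tag, focus word, next word.
    """
    words = [w for w, _ in tagged_sentence]
    tags = [t for _, t in tagged_sentence]
    cases = []
    for i, word in enumerate(words):
        prev_tag = tags[i - 1] if i > 0 else "<s>"
        next_word = words[i + 1] if i + 1 < len(words) else "</s>"
        cases.append(((prev_tag, word, next_word), tags[i]))
    return cases


def overlap(a, b):
    """Count the feature positions on which two cases agree."""
    return sum(x == y for x, y in zip(a, b))


def tag_sentence(words, memory, k=1):
    """Tag left to right, voting among the k most similar stored cases."""
    tags = []
    for i, word in enumerate(words):
        prev_tag = tags[i - 1] if i > 0 else "<s>"
        next_word = words[i + 1] if i + 1 < len(words) else "</s>"
        query = (prev_tag, word, next_word)
        # Linear scan over all stored cases (no index, unlike IGTree).
        ranked = sorted(memory, key=lambda case: overlap(query, case[0]),
                        reverse=True)
        votes = Counter(tag for _, tag in ranked[:k])
        tags.append(votes.most_common(1)[0][0])
    return tags


# Toy tagged corpus, invented for this example.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]
memory = [case for sentence in corpus for case in extract_cases(sentence)]
print(tag_sentence(["the", "bird", "sleeps"], memory))
```

In this toy setting the unseen word "bird" still receives a tag, because its context (previous tag and next word) overlaps with a stored case; the sketch says nothing about accuracy on real corpora.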

    Developing a dominant logic of strategic innovation

    Purpose: This paper aims to lay the foundations to develop a dominant logic and a common thematic framework of strategic innovation (SI) and to encourage consensus over the field’s core foundation of main themes. Design/methodology/approach: The paper explores the intersection between the constituent fields of strategic management and innovation management through a concept mapping process. The paper categorizes the main themes and searches for common ground in order to develop the core thematic framework of SI. The paper looks at the sub-themes of SI in published research and develops a more detailed framework. The conceptual categories derived from the process are then placed in a logical sequence according to how they occur in practice or in the order in which the concepts develop from one another. Findings: The results yield seven main themes that form the main taxonomy of SI: types of SI, environmental analysis of SI, SI planning, enabling SI, collaborative networks, managing knowledge, and strategic outcomes. Research limitations/implications: The new thematic framework the paper is proposing for SI remains preliminary in nature and would need to be tried and tested by researchers and practitioners in order to gain acceptability. Academic rigor and methodological structure are not sufficient to determine whether our conceptual framework will become widely diffused in academia and industry. It would have to pass through an emergent, evolutionary process of selection, adoption and an inevitable degree of change and adaptation, just like any other innovation. Practical implications: The practical implications concern the production of instructive material and the application of strategic management initiatives in industry. The proposed themes and sub-themes can serve as a logical framework to develop and update publications, which have been instrumental in their own right to shape the field. The paper also provides a checklist of potential research projects in SI, which will improve and strengthen the field. The new framework provides a comprehensive checklist of strategic management initiatives that will help industry to initiate, plan and execute effective innovation strategies. Originality/value: The concept mapping of the themes of SI yields a new dominant logic, which will influence the evolution of the field and its relevance to both academia and industry.

    Buzz monitoring in word space

    This paper discusses the task of tracking mentions of some topically interesting textual entity from a continuously and dynamically changing flow of text, such as a news feed, the output from an Internet crawler or a similar text source - a task sometimes referred to as buzz monitoring. Standard approaches from the field of information access for identifying salient textual entities are reviewed, and it is argued that the dynamics of buzz monitoring call for more accomplished analysis mechanisms than the typical text analysis tools provide today. The notion of word space is introduced, and it is argued that word spaces can be used to select the most salient markers for topicality and to find the associations those observations engender, and that they constitute an attractive foundation for building a representation well suited for the tracking and monitoring of mentions of the entity under consideration.
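
As a rough illustration of what a word space is, the sketch below builds co-occurrence vectors from tokenized text and compares words by cosine similarity. It is a generic construction under stated assumptions, not the method of the paper: the window size, the raw-count weighting, the toy sentences, and the function names `build_word_space`, `cosine` and `associations` are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_word_space(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens."""
    space = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    space[word][tokens[j]] += 1
    return space


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def associations(space, target, n=3):
    """Return the n words whose context vectors are closest to the target's."""
    target_vec = space.get(target, Counter())
    scores = [(other, cosine(target_vec, vec))
              for other, vec in space.items() if other != target]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:n]


# Toy text stream, invented for this example.
sentences = [
    ["phone", "battery", "drains", "fast"],
    ["tablet", "battery", "drains", "fast"],
    ["stock", "price", "falls"],
]
space = build_word_space(sentences)
print(associations(space, "phone", n=2))
```

Words that occur in similar contexts end up with similar vectors, which is the property a monitoring system could exploit to find terms associated with a tracked entity; for a continuously changing stream, the vectors would need to be updated incrementally rather than rebuilt as here.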