1,051 research outputs found

    The Lexicon Graph Model : a generic model for multimodal lexicon development

    Trippel T. The Lexicon Graph Model : a generic model for multimodal lexicon development. Bielefeld (Germany): Bielefeld University; 2006.
    The Lexicon Graph Model is a model for lexicons that can be corpus-based and contain multimodal information. It takes the perspective of lexicon theory, examining the underlying data structures of both lexicons and annotations. Annotations come into view because they are regarded as the basis for creating lexicons. The term lexicon here refers both to dictionaries and to lexicon databases integrated into electronic applications. Existing formalisms and approaches to lexicon development reveal various problems with lexicons, such as merging existing lexicons into one, resolving ambiguities on different lexical levels, representing other modalities in the lexicon, and selecting the lexical key term for lexicon entries. The present approach assumes that lexicons differ in their content but not in their fundamental structure, so that lexicons of different kinds can be combined without duplicates in a unification process. This results in declarative lexicons. For lexicons, these graphs can be modeled with the Lexicon Graph Model as presented here. Lexicon graphs are viewed as analogous to the annotation graphs described by Bird and Liberman and can therefore be processed in a similar way. The investigation of the lexicon formalism proceeds in four steps. First, existing lexicons are analysed and described. Then the Lexicon Graph Model is introduced as a generic representation of lexicons, which is also implemented and tested. Based on this formalism, the relation to annotation graphs is established, together with a description of the standards that annotations must meet to be suitable for lexicon development.
    The Lexicon Graph Model provides a model and framework for lexicons that can be corpus-based and contain multimodal information. The focus is on the lexicon theory perspective, looking at the underlying data structures that are part of existing lexicons and corpora. The term lexicon is used in different ways in linguistics and artificial intelligence, covering traditional print dictionaries in book form, CD-ROM editions, and Web-based versions of the same, but also computerized resources of similar structure to be used by applications. These applications range from systems for human-machine communication to spell checkers. In this work, lexicon is used as the most generic term covering all lexical applications. Existing formalisms in lexicon development show different problems with lexicons, for example combining different kinds of lexical resources, disambiguation on different lexical levels, and the representation of different modalities in a lexicon. The Lexicon Graph Model presupposes that lexicons can differ in content but share a fundamentally similar structure, making it possible to combine lexicons in a unification process, resulting in a declarative lexicon. The underlying model is a graph, the Lexicon Graph, which is modeled similarly to Annotation Graphs as described by Bird and Liberman. The investigation of the lexicon formalism comprises four steps: the analysis of existing lexicons, the introduction of the Lexicon Graph Model as a generic representation for lexicons, the implementation of the formalism in different contexts, and an evaluation of the formalism. It is shown that Annotation Graphs and Lexicon Graphs are indeed related, and not only in their formalism, and it is shown what standards annotations have to meet to be usable for lexicon development.
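
The idea of combining lexicons without duplicates can be illustrated with a minimal sketch. It is not the formalism of the thesis: it assumes, purely for illustration, that a lexicon is a set of labelled edges, and the names `Edge`, `unify` and `lookup` as well as the sample entries are hypothetical.

```python
from typing import List, NamedTuple, Set


class Edge(NamedTuple):
    """One labelled edge of a toy lexicon graph (hypothetical structure)."""
    source: str    # e.g. an orthographic form
    relation: str  # kind of lexical information
    target: str    # the value linked to the source


def unify(*lexicons: Set[Edge]) -> Set[Edge]:
    """Merge lexicons; identical edges collapse because a set holds no duplicates."""
    merged: Set[Edge] = set()
    for lexicon in lexicons:
        merged |= lexicon
    return merged


def lookup(graph: Set[Edge], source: str, relation: str) -> List[str]:
    """Return all targets reachable from `source` via `relation`, sorted."""
    return sorted(e.target for e in graph
                  if e.source == source and e.relation == relation)


# Two small lexicons with different content but the same edge structure.
pronunciation_lexicon = {
    Edge("bank", "pos", "noun"),
    Edge("bank", "phonemic", "b ae ng k"),
}
sense_lexicon = {
    Edge("bank", "pos", "noun"),  # also present in the other lexicon
    Edge("bank", "sense", "financial institution"),
    Edge("bank", "sense", "river edge"),
}

combined = unify(pronunciation_lexicon, sense_lexicon)
print(len(combined))
print(lookup(combined, "bank", "sense"))
```

The shared part-of-speech edge appears only once in `combined`, while the ambiguity of "bank" stays visible as two separate `sense` edges.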

    The Lexical Grid: Lexical Resources in Language Infrastructures

    Language Resources (LRs) are recognized as central and strategic for the development of any Human Language Technology system and application product. They play a critical role as horizontal technology and have been recognized on many occasions as a priority, also by national and supra-national bodies funding a number of initiatives (such as EAGLES, ISLE, ELRA) to establish some coordination of LR activities, and a number of large LR creation projects, in both the written and the speech areas

    DFKI publications : the first four years ; 1990 - 1993

    Discovering Lexical Generalisations. A Supervised Machine Learning Approach to Inheritance Hierarchy Construction

    Institute for Communicating and Collaborative Systems
    Grammar development over the last decades has seen a shift away from large inventories of grammar rules to richer lexical structures. Many modern grammar theories are highly lexicalised. But simply listing lexical entries typically results in an undesirable amount of redundancy. Lexical inheritance hierarchies, on the other hand, make it possible to capture linguistic generalisations and thereby reduce redundancy. Inheritance hierarchies are usually constructed by hand but this is time-consuming and often impractical if a lexicon is very large. Constructing hierarchies automatically or semiautomatically facilitates a more systematic analysis of the lexical data. In addition, lexical data is often extracted automatically from corpora and this is likely to increase over the coming years. Therefore it makes sense to go a step further and automate the hierarchical organisation of lexical data too. Previous approaches to automatic lexical inheritance hierarchy construction tended to focus on minimality criteria, aiming for hierarchies that minimised one or more criteria such as the number of path-value pairs, the number of nodes or the number of inheritance links (Petersen 2001, Barg 1996a, and in a slightly different context: Light 1994). Aiming for minimality is motivated by the fact that the conciseness of inheritance hierarchies is a main reason for their use. However, I will argue that there are several problems with minimality-based approaches. First, minimality is not well defined in the context of lexical inheritance hierarchies as there is a tension between different minimality criteria. Second, minimality-based approaches tend to underestimate the importance of linguistic plausibility. While such approaches start with a definition of minimal redundancy and then try to prove that this leads to plausible hierarchies, the approach suggested here takes the opposite direction. 
It starts with a manually built hierarchy to which a supervised machine learning algorithm is applied with the aim of finding a set of formal criteria that can guide the construction of plausible hierarchies. Taking this direction means that it is more likely that the selected criteria do in fact lead to plausible hierarchies. Using a machine learning technique also has the advantage that the set of criteria can be much larger than in hand-crafted definitions. Consequently, one can define conciseness in very broad terms, taking into account interdependencies in the data as well as simple minimality criteria. This leads to a more fine-grained model of hierarchy quality. In practice, the method proposed here consists of two components: Galois lattices are used to define the search space as the set of all generalisations over the input lexicon. Maximum entropy models which have been trained on a manually built hierarchy are then applied to the lattice of the input lexicon to distinguish between plausible and implausible generalisations based on the formal criteria that were found in the training step. An inheritance hierarchy is then derived by pruning implausible generalisations. The hierarchy is automatically evaluated by matching it to a manually built hierarchy for the input lexicon. Automatically constructing lexical hierarchies is a hard task, partly because what is considered the best hierarchy for a lexicon is to some extent subjective. Supervised learning methods also suffer from a lack of suitable training data. Hence, a semi-automatic architecture may be best suited for the task. Therefore, the performance of the system has been tested using a semi-automatic as well as an automatic architecture and it has also been compared to the performance achieved by the pruning algorithm suggested by Petersen (2001). The findings show that the method proposed here is well suited for semi-automatic hierarchy construction
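
The first component, the Galois lattice as a search space of generalisations, can be sketched on a toy lexicon. This is only an illustration under stated assumptions: the four entries, the `feature=value` strings and the function name `formal_concepts` are hypothetical, and the maximum entropy pruning step is not shown.

```python
def formal_concepts(lexicon):
    """Enumerate all formal concepts (extent, intent) of a toy lexicon.

    `lexicon` maps each entry to a frozenset of feature=value strings.
    Every concept is one candidate generalisation in the Galois lattice.
    """
    all_features = frozenset().union(*lexicon.values())
    # Intents are exactly the intersections of entry feature sets,
    # plus the full feature set (the intent of the empty extent).
    intents = {all_features}
    for features in lexicon.values():
        intents |= {intent & features for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is every entry that carries all features of the intent.
        extent = frozenset(entry for entry, features in lexicon.items()
                           if intent <= features)
        concepts.append((extent, intent))
    # Most general concepts (largest extents) first.
    return sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[0])))


# Hypothetical input lexicon.
toy_lexicon = {
    "walk": frozenset({"cat=verb", "past=regular"}),
    "talk": frozenset({"cat=verb", "past=regular"}),
    "sing": frozenset({"cat=verb", "past=irregular"}),
    "dog": frozenset({"cat=noun"}),
}

lattice = formal_concepts(toy_lexicon)
for extent, intent in lattice:
    print(sorted(extent), sorted(intent))
```

In this toy case the concept with intent `cat=verb` groups "walk", "talk" and "sing", which is the kind of generalisation a pruning step would then either keep or discard when deriving an inheritance hierarchy.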

    PIM : planning in manufacturing using skeletal plans and features

    In order to create a production plan from product model data, a human expert thinks in a special terminology with respect to the given work piece and its production plan: the expert recognizes certain features and associates them with fragments of a production plan. By combining these skeletal plans, the expert generates the complete production plan. We present a set of representation formalisms suitable for modelling this approach. Once an expert's knowledge has been represented using these formalisms, a production plan can be generated by a sequence of abstraction, selection and refinement. This is demonstrated in the CAPP system PIM, which is currently being developed as a prototype. The close modelling of the knowledge of a concrete expert (or the accumulated know-how of a concrete factory) facilitates the development of planning systems that are specifically tailored to the concrete manufacturing environment and make optimal use of the expert's knowledge, and it should also lead to improved acceptance of the system