101 research outputs found

    Proceedings of the Conference on Natural Language Processing 2010

    This book contains state-of-the-art contributions to the 10th conference on Natural Language Processing, KONVENS 2010 (Konferenz zur Verarbeitung natürlicher Sprache), with a focus on semantic processing. KONVENS in general aims to offer a broad perspective on current research and developments within the interdisciplinary field of natural language processing. The central theme draws specific attention to linguistic aspects of meaning, covering deep as well as shallow approaches to semantic processing. The contributions address both knowledge-based and data-driven methods for modelling and acquiring semantic information, and discuss the role of semantic information in applications of language technology. The articles demonstrate the importance of semantic processing, and present novel and creative approaches to natural language processing in general. Some contributions focus on developing and improving NLP systems for tasks like Named Entity Recognition or Word Sense Disambiguation, on semantic knowledge acquisition and exploitation with respect to collaboratively built resources, or on harvesting semantic information in virtual games. Others are set within the context of real-world applications, such as Authoring Aids, Text Summarisation and Information Retrieval. The collection highlights the importance of semantic processing for different areas and applications in Natural Language Processing, and provides the reader with an overview of current research in this field.

    Multiple Timescale Feature Combination towards Robust Speech Recognition

    While much progress has been made in recent years in the field of Automatic Speech Recognition (ASR), one of the main remaining problems is robustness. Typically, state-of-the-art ASR systems work very efficiently in well-defined environments, e.g. for clean speech or known noise conditions. However, their performance degrades drastically under different conditions. Many approaches have been developed to circumvent this problem, ranging from noise cancellation to system adaptation techniques. This paper investigates the influence on noise robustness of using additional information from relatively long timescales. The multiple timescale feature combination approach is introduced. Experiments show that robustness could be improved in noisy conditions while recognition performance for clean speech was maintained.

    Sprachliche Variabilität des Deutschen und ihre Erfassung mit Methoden der automatischen Spracherkennung

    The database will be based on the results of analysing relevant large corpora of spoken German. To analyse large corpora, however, automatic methods for analysing variation must be developed. With traditional manual methods, building a corpus-based database is hardly feasible. The variation project proper was therefore preceded by a small pilot study intended to test the possibilities of automatic analysis. It investigated whether regional varieties of German can be studied with methods of automatic speech recognition, i.e., whether a reliable transcription of the regional variants can be produced automatically. This pilot study on automatic transcription relied on the SPRAT system (Speech Recognition and Alignment Tool), already available at the IDS, which is used for alignment (text-to-audio synchronisation). Within the pilot study, this system was modified and its automatic transcription was evaluated in a series of tests (cf. Section 3). The aim of the present contribution is to present the results of this pilot study. First, however, a brief excursus will clarify what kind of system the IDS aligner SPRAT is.

    Rover und TüNDRA: Such- und Visualisierungsplattformen für Wortnetze und Baumbanken

    Suitable search and visualisation tools, ideally in the form of web applications, are of great importance for user-friendly access to language resources. In this contribution we present the web applications Rover and TüNDRA, which were developed at the CLARIN-D Centre Tübingen as part of the BMBF project CLARIN-D.

    Syntactic Reference Corpus of Medieval French (SRCMF)

    The aim of the SRCMF project is to mark up syntactic structures in the texts of two major Old French corpora, the Base de Français Médiéval and the Nouveau Corpus d'Amsterdam. The annotation model is dependency-based. At first, the texts are marked up manually. Later, this annotation is used as a "gold standard" to train automatic parsers. The corpus can currently be searched using TigerSearch software. Project documentation will be published on the web as soon as it is stable, and access to the corpus will be provided to researchers upon reasoned request.

    Continuous variation in computational morphology - the example of Swiss German

    Most work in natural language processing is geared towards written, standardized language varieties. This focus is generally justified on practical grounds of data availability and socio-economic relevance, but does not always reflect the linguistic reality of sub-standard varieties. In this paper, we aim at the computational description of the morphology of a language with continuous internal variation, as it is encountered in most dialect landscapes. The work presented here is applied to Swiss German dialects; these dialects are well documented through dialectological research and are among the most lively ones in Europe in terms of social acceptance and media exposure. Our work is inspired by previous research in generative dialectology and computational linguistics, which attempts to derive multiple dialect systems from a single reference system with the help of hand-written transformation rules. Such transformation rules may be called georeferenced, in the sense that they link to a set of geographic coordinates that can be grounded on a map. We improve on this work in several respects. First, our model associates all rules with probabilistic maps extracted from linguistic atlases. This allows us to handle transition zones in which several variants are accepted. Second, we provide a full implementation of this model on the basis of finite-state transducers. In addition to finite-state composition, which derives dialectal word forms by applying several rules in cascade, we propose a second type of composition, map composition, to compute the area of validity of the derived word forms on the basis of the probabilistic maps associated with the rules.
    In this paper, we will focus on two aspects of the proposed model: its theoretical value as a computationally effective description of continuous linguistic variation, and its practical value as a word-level machine translation system from Standard German into the various Swiss German dialects. We evaluate the model on the latter aspect.

    Wissenschaftlich-Technischer Jahresbericht 1992

    Hybride konnektionistische, statistische und regelbasierte Ansätze zur Verarbeitung natürlicher Sprache : Workshop auf der 21. Deutschen Jahrestagung für Künstliche Intelligenz, Freiburg, 9.-10. September 1997
