
    Atomic: an open-source software platform for multi-level corpus annotation

    This paper presents Atomic, an open-source, platform-independent desktop application for multi-level corpus annotation. Atomic aims to provide the linguistic community with a user-friendly annotation tool and a sustainable platform through its focus on extensibility, a generic data model, and compatibility with existing linguistic formats. It is implemented on top of the Eclipse Rich Client Platform, a pluggable Java-based framework for creating client applications. Atomic, as a set of plug-ins for this framework, integrates with the platform and allows other researchers to develop and integrate further extensions to the software as needed. The generic graph-based meta model Salt serves as Atomic’s domain model and allows for unlimited annotation levels and types. Salt is also used as an intermediate model in the Pepper framework for the conversion of linguistic data, which is fully integrated into Atomic, making the latter compatible with a wide range of linguistic formats. Atomic provides tools for both less experienced and expert annotators: graphical, mouse-driven editors and a command-line data manipulation language for rapid annotation.

    A discriminative approach to grounded spoken language understanding in interactive robotics

    Spoken Language Understanding in Interactive Robotics provides computational models of human-machine communication based on vocal input. However, robots operate in specific environments, and the correct interpretation of spoken sentences depends on the physical, cognitive and linguistic aspects triggered by the operational environment. Grounded language processing should exploit both the physical constraints of the context and the knowledge assumptions of the robot. These include the subjective perception of the environment that explicitly affects linguistic reasoning. In this work, a standard linguistic pipeline for semantic parsing is extended toward a form of perceptually informed natural language processing that combines discriminative learning and distributional semantics. Empirical results show a relative error reduction of up to 40%.

    Strict and non-strict negative concord in Hungarian: A unified analysis

    Surányi (2006) observed that Hungarian has a hybrid (strict + non-strict) negative concord system. This paper proposes a uniform analysis of that system within the general framework of Zeijlstra (2004, 2008) and, especially, Chierchia (2013), with the following new ingredients. Sentential negation NEM is the same full negation in the presence of both strict and non-strict concord items. Preverbal SENKI ‘n-one’ type negative concord items occupy the specifier position of either NEM ‘not’ or SEM ‘nor’. The latter, SEM, spells out IS ‘too, even’ in the immediate scope of negation; it is a focus-sensitive head on the clausal spine. SEM can be seen as an overt counterpart of the phonetically null head that Chierchia dubs NEG; it is capable of invoking an abstract (disembodied) negation at the edge of its projection.

    Ethics as Grammar: a Note on Method and the Treatise on Good Works


    Laying the Foundation for In-car Alcohol Detection by Speech

    The fact that an increasing number of functions in the automobile are and will be controlled by the driver's speech raises the question of whether this speech input may be used to detect possible alcoholic intoxication of the driver. To that end, a large part of the new Alcohol Language Corpus (ALC) edited by the Bavarian Archive of Speech Signals (BAS) will be used for a broad statistical investigation of possible feature candidates for classification. In this contribution we present the motivation and the design of the ALC corpus as well as first results from fundamental frequency and rhythm analysis. Our analysis, which compares sober and intoxicated speech of the same individuals, suggests that there are in fact promising features that can be derived automatically from the speech signal during the speech recognition process and that indicate intoxication for most speakers.