132 research outputs found

    Prosodic Annotation in a Thai Text-to-speech System

    Get PDF
    PACLIC 21 / Seoul National University, Seoul, Korea / November 1-3, 200

    SCREEN: Learning a Flat Syntactic and Semantic Spoken Language Analysis Using Artificial Neural Networks

    Get PDF
    In this paper, we describe a so-called screening approach for learning robust processing of spontaneously spoken language. A screening approach is a flat analysis which uses shallow sequences of category representations for analyzing an utterance at various syntactic, semantic and dialog levels. Rather than using a deeply structured symbolic analysis, we use a flat connectionist analysis. This screening approach aims at supporting speech and language processing by using (1) data-driven learning and (2) robustness of connectionist networks. In order to test this approach, we have developed the SCREEN system which is based on this new robust, learned and flat analysis. In this paper, we focus on a detailed description of SCREEN's architecture, the flat syntactic and semantic analysis, the interaction with a speech recognizer, and a detailed evaluation analysis of the robustness under the influence of noisy or incomplete input. The main result of this paper is that flat representations allow more robust processing of spontaneous spoken language than deeply structured representations. In particular, we show how the fault-tolerance and learning capability of connectionist networks can support a flat analysis for providing more robust spoken-language processing within an overall hybrid symbolic/connectionist framework.Comment: 51 pages, Postscript. To be published in Journal of Artificial Intelligence Research 6(1), 199

    Computational Approaches to the Syntax–Prosody Interface: Using Prosody to Improve Parsing

    Full text link
    Prosody has strong ties with syntax, since prosody can be used to resolve some syntactic ambiguities. Syntactic ambiguities have been shown to negatively impact automatic syntactic parsing, hence there is reason to believe that prosodic information can help improve parsing. This dissertation considers a number of approaches that aim to computationally examine the relationship between prosody and syntax of natural languages, while also addressing the role of syntactic phrase length, with the ultimate goal of using prosody to improve parsing. Chapter 2 examines the effect of syntactic phrase length on prosody in double center embedded sentences in French. Data collected in a previous study were reanalyzed using native speaker judgment and automatic methods (forced alignment). Results demonstrate similar prosodic splitting behavior as in English in contradiction to the original study’s findings. Chapter 3 presents a number of studies examining whether syntactic ambiguity can yield different prosodic patterns, allowing humans and/or computers to resolve the ambiguity. In an experimental study, humans disambiguated sentences with prepositional phrase- (PP)-attachment ambiguity with 49% accuracy presented as text, and 63% presented as audio. Machine learning on the same data yielded an accuracy of 63-73%. A corpus study on the Switchboard corpus used both prosodic breaks and phrase lengths to predict the attachment, with an accuracy of 63.5% for PP-attachment sentences, and 71.2% for relative clause attachment. Chapter 4 aims to identify aspects of syntax that relate to prosody and use these in combination with prosodic cues to improve parsing. The aspects identified (dependency configurations) are based on dependency structure, reflecting the relative head location of two consecutive words, and are used as syntactic features in an ensemble system based on Recurrent Neural Networks, to score parse hypotheses and select the most likely parse for a given sentence. Using syntactic features alone, the system achieved an improvement of 1.1% absolute in Unlabelled Attachment Score (UAS) on the test set, above the best parser in the ensemble, while using syntactic features combined with prosodic features (pauses and normalized duration) led to a further improvement of 0.4% absolute. The results achieved demonstrate the relationship between syntax, syntactic phrase length, and prosody, and indicate the ability and future potential of prosody to resolve ambiguity and improve parsing

    How functional programming mattered

    Get PDF
    In 1989 when functional programming was still considered a niche topic, Hughes wrote a visionary paper arguing convincingly ‘why functional programming matters’. More than two decades have passed. Has functional programming really mattered? Our answer is a resounding ‘Yes!’. Functional programming is now at the forefront of a new generation of programming technologies, and enjoying increasing popularity and influence. In this paper, we review the impact of functional programming, focusing on how it has changed the way we may construct programs, the way we may verify programs, and fundamentally the way we may think about programs

    Parsing for prosody: What a text-to-speech system needs from syntax

    Get PDF
    The authors describe an experimental text-to-speech system that uses a syntactic parser and prosody rules to determine prosodic phrasing for synthesized speech. It is shown that many aspects of sentence analysis that are required for other parsing applications, e.g., machine translation and question answering, become unnecessary in parsing for text-to-speech. It is possible to generate natural-sounding prosodic phrasing by relying on information about syntactic category type, partial constituency, and length; information about clausal and verb phrase constituency, predicate-argument relations, and prepositional phrase attachment can be bypassed

    Diverging neural dynamics for syntactic structure building in naturalistic speaking and listening

    Get PDF
    The neural correlates of sentence production have been mostly studied with constraining task paradigms that introduce artificial task effects. In this study, we aimed to gain a better understanding of syntactic processing in spontaneous production vs. naturalistic comprehension. We extracted word-by-word metrics of phrase-structure building with top-down and bottom-up parsers that make different hypotheses about the timing of structure building. In comprehension, structure building proceeded in an integratory fashion and led to an increase in activity in posterior temporal and inferior frontal areas. In production, structure building was anticipatory and predicted an increase in activity in the inferior frontal gyrus. Newly developed production-specific parsers highlighted the anticipatory and incremental nature of structure building in production, which was confirmed by a converging analysis of the pausing patterns in speech. Overall, the results showed that the unfolding of syntactic processing diverges between speaking and listening

    Computational Language Assessment in patients with speech, language, and communication impairments

    Full text link
    Speech, language, and communication symptoms enable the early detection, diagnosis, treatment planning, and monitoring of neurocognitive disease progression. Nevertheless, traditional manual neurologic assessment, the speech and language evaluation standard, is time-consuming and resource-intensive for clinicians. We argue that Computational Language Assessment (C.L.A.) is an improvement over conventional manual neurological assessment. Using machine learning, natural language processing, and signal processing, C.L.A. provides a neuro-cognitive evaluation of speech, language, and communication in elderly and high-risk individuals for dementia. ii. facilitates the diagnosis, prognosis, and therapy efficacy in at-risk and language-impaired populations; and iii. allows easier extensibility to assess patients from a wide range of languages. Also, C.L.A. employs Artificial Intelligence models to inform theory on the relationship between language symptoms and their neural bases. It significantly advances our ability to optimize the prevention and treatment of elderly individuals with communication disorders, allowing them to age gracefully with social engagement.Comment: 36 pages, 2 figures, to be submite

    Parsing the Billboard Chord Transcriptions

    Get PDF

    Integrating Prosodics into a Language Model for Spoken Language Understanding of Thai

    Get PDF
    PACLIC / The University of the Philippines Visayas Cebu College Cebu City, Philippines / November 20-22, 200
    • …
    corecore