3,696 research outputs found

    Generating Tailored, Comparative Descriptions with Contextually Appropriate Intonation

    Generating responses that take user preferences into account requires adaptation at all levels of the generation process. This article describes a multi-level approach to presenting user-tailored information in spoken dialogues which brings together for the first time multi-attribute decision models, strategic content planning, surface realization that incorporates prosody prediction, and unit selection synthesis that takes the resulting prosodic structure into account. The system selects the most important options to mention and the attributes that are most relevant to choosing between them, based on the user model. Multiple options are selected when each offers a compelling trade-off. To convey these trade-offs, the system employs a novel presentation strategy which straightforwardly lends itself to the determination of information structure, as well as the contents of referring expressions. During surface realization, the prosodic structure is derived from the information structure using Combinatory Categorial Grammar in a way that allows phrase boundaries to be determined in a flexible, data-driven fashion. This approach to choosing pitch accents and edge tones is shown to yield prosodic structures with significantly higher acceptability than baseline prosody prediction models in an expert evaluation. These prosodic structures are then shown to enable perceptibly more natural synthesis using a unit selection voice that aims to produce the target tunes, in comparison to two baseline synthetic voices. An expert evaluation and f0 analysis confirm the superiority of the generator-driven intonation and its contribution to listeners' ratings.

    Strategies for developing a conversational speech dataset for Text-To-Speech Synthesis

    Funding Information: The first author has received funding from the European Union's Horizon 2020 research and innovation program under the Marie Skłodowska Curie grant agreement No 859588. The authors are thankful to Maaike Groenewege, Johannah O'Mahony and ReadSpeaker's R&D team whose suggestions and discussions have been instrumental in shaping the direction of this paper. Publisher Copyright: Copyright © 2022 ISCA. There have been many efforts to improve the quality of speech synthesis systems in conversational AI. Although state-of-the-art systems are capable of producing natural-sounding speech, the generated speech often lacks prosodic variation and is not always suited to the task. In this paper, we examine dialogue data collection methods to use as training data for our acoustic models. We collect speech using three different setups: (1) Random read-aloud sentences; (2) Performed dialogues; (3) Semi-Spontaneous dialogues. We analyze prosodic and textual properties of the data collected in these setups and make some recommendations to collect data for speech synthesis in conversational AI settings. Peer reviewed.

    CLiFF Notes: Research in the Language Information and Computation Laboratory of The University of Pennsylvania

    This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However, the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science, Psychology, and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. With 48 individual contributors and six projects represented, this is the largest LINC Lab collection to date, and the most diverse.

    Using Muted-Video Enactments to Develop Sociolinguistic Awareness


    Research in the Language, Information and Computation Laboratory of the University of Pennsylvania

    This report takes its name from the Computational Linguistics Feedback Forum (CLiFF), an informal discussion group for students and faculty. However, the scope of the research covered in this report is broader than the title might suggest; this is the yearly report of the LINC Lab, the Language, Information and Computation Laboratory of the University of Pennsylvania. It may at first be hard to see the threads that bind together the work presented here, work by faculty, graduate students and postdocs in the Computer Science and Linguistics Departments, and the Institute for Research in Cognitive Science. It includes prototypical Natural Language fields such as: Combinatorial Categorial Grammars, Tree Adjoining Grammars, syntactic parsing and the syntax-semantics interface; but it extends to statistical methods, plan inference, instruction understanding, intonation, causal reasoning, free word order languages, geometric reasoning, medical informatics, connectionism, and language acquisition. Naturally, this introduction cannot spell out all the connections between these abstracts; we invite you to explore them on your own. In fact, with this issue it’s easier than ever to do so: this document is accessible on the “information superhighway”. Just call up http://www.cis.upenn.edu/~cliff-group/94/cliffnotes.html. In addition, you can find many of the papers referenced in the CLiFF Notes on the net. Most can be obtained by following links from the authors’ abstracts in the web version of this report. The abstracts describe the researchers’ many areas of investigation, explain their shared concerns, and present some interesting work in Cognitive Science. We hope its new online format makes the CLiFF Notes a more useful and interesting guide to Computational Linguistics activity at Penn.

    Staging mental discursive processes and reactions: The construction of direct reported thought (DRT) in conversational storytelling

    This article approaches the construction of reported thought in everyday conversation by analysing instances of direct reported thought (DRT), taken from storytelling sequences. It is argued that DRT is used by narrators as a device to portray, in a dynamic sense, the ways in which they experience the story world in their mind, as discursive processes and reactions around an external event that clash with their expectations or initial perception of the situation. More specifically, the analysis shows that DRT is employed to stage a "first wrong thought" (Jefferson 2004) that is shaped in a range of ways, as a process of worrying, deliberating, lamenting, and blaming or accusing someone in the situation, as well as shocked and indignant reactions that are constructed as exclamations and a process of reproaching and planning a future revenge action. (Direct reported thought, conversational storytelling, mental discursive processes, mental reactions, first wrong thought, silent shock, inner experience, direct reported speech)

    Intonation in a text-to-speech conversion system


    TEACHING ENGLISH USING BOARD-GAME STRATEGY: ITS EFFECT ON STUDENTS’ SPEAKING ABILITY AT MA DAREL HIKMAH PEKANBARU

    ABSTRACT Ari Saputra (2019): Teaching English Using Board Game Strategy and Its Effect on Students' Speaking Ability at MA Dar El Hikmah Pekanbaru. This research was conducted in response to several problems in English lessons, especially in speaking ability. It aimed to determine the speaking ability of students taught using the Board Game strategy and of students taught without it, and to find out whether using Board Game has a significant effect on the speaking ability of tenth-grade students at MA Dar El Hikmah Pekanbaru. The research was quasi-experimental: the researcher took two of the five tenth-grade classes as the sample, an experimental class of 20 students and a control class also of 20 students, and administered a post-test to obtain the results. The data analysis showed that using Board Game had a positive effect on students' speaking ability, with the t-obtained value larger than the t-table value at both the 5% and 1% significance levels (1.68595 2.42857). Therefore, Ho was rejected and Ha was accepted; that is, using Board Game made a significant difference to the speaking ability of tenth-grade students. This can also be seen from the effect size, calculated with the eta squared formula, which gave a result of 0.31435512 and is categorized as a large effect. Keywords: Board Game, Speaking Ability

    Is That a Rhetorical Question?: A Pragmatic Analysis

    There has been much work on the syntax, semantics, and pragmatics of questions. While the argument herein is that rhetorical questions do not function like typical information-seeking questions, it remains the case that they are, if nothing else, syntactically interrogative. This fact is explored by examining different types of rhetorical questions through various lenses, including question semantics, Gricean pragmatics, and Speech Act Theory. A pragmatic framework is proposed to explain the effects that rhetorical questions have on the conversational scoreboard. Their illocutionary force is also considered, as it, along with contextual factors, can affect how rhetorical questions are interpreted. This paper offers a new definition of RHETORICAL QUESTIONS as well as providing an analysis of their pragmatic effects.