321 research outputs found

    Tagging Prosody and Discourse Structure in Elicited Spontaneous Speech

    This paper motivates and describes the annotation and analysis of prosody and discourse structure for several large spoken language corpora. The annotation schemas are of two types: tags for prosody and intonation, and tags for several aspects of discourse structure. The choice of the particular tagging schema in each domain is based in large part on the insights it provides in corpus-based studies of the relationship between discourse structure and the accenting of referring expressions in American English. We first describe these results and show that the same models account for the accenting of pronouns in an extended passage from one of the Speech Warehouse hotel-booking dialogues. We then turn to corpora described in Venditti [Ven00], which adapts the same models to Tokyo Japanese. Japanese is interesting to compare to English, because accent is lexically specified and so cannot mark discourse focus in the same way. Analyses of these corpora show that local pitch range expansion serves the analogous focusing function in Japanese. The paper concludes with a section describing several outstanding questions in the annotation of Japanese intonation which corpus studies can help to resolve. Work reported in this paper was supported in part by a grant from the Ohio State University Office of Research, to Mary E. Beckman and co-principal investigators on the OSU Speech Warehouse project, and by an Ohio State University Presidential Fellowship to Jennifer J. Venditti.

    On the Multiple Clause Linkage Structure of Japanese: A Corpus-based Study

    In this paper, we describe the distribution of the multiple clause linkage structure in actual spoken and written Japanese. We examine three Japanese corpora: BCCWJ, CSJ and OCOJ. By identifying the distributions of multiple clause linkage structures in corpora of contemporary Japanese (BCCWJ and CSJ), we shed light on which settings give rise to which types of clause linkage structure, and through what processes. The dynamic rewriting rule proposed by Kondo (2005) is introduced as a model for the incremental production of multiple clause linkage structures. Some common patterns of such structures occurring in Old Japanese are identified using OCOJ and compared to the patterns in BCCWJ and CSJ.

    A Corpus-Based Comparison of Syntactic Complexity in Spoken and Written Learner Language

    Despite writing and speaking being related activities, their end-products are entirely different. However, previous studies have not shown consistency in terms of grammar use in these two modes. Accordingly, in the present study, I aim to define the syntactic characteristics of these two modes with large-scale data and organized research designs. This study examined 14 indices of syntactic complexity and specific grammar factors in 224 monologues and 139 writings of Korean EFL undergraduates. The results revealed that learners tended to use more finite complement clauses and relative clauses while writing, but used because-fragments independently and sentence-initial ‘and’ more frequently while speaking. When compared with previous studies, the characteristics of syntactic complexity of Korean EFL learners, regardless of age, are defined by the use of coordination in speaking and the use of subordination in writing.

    The COPLE2 Corpus: a Learner Corpus for Portuguese

    We present the COPLE2 corpus, a learner corpus of Portuguese that includes written and spoken texts produced by learners of Portuguese as a second or foreign language. The corpus includes at the moment a total of 182,474 tokens and 978 texts, classified according to the CEFR scales. The original handwritten productions are transcribed in TEI-compliant XML format and keep record of all the original information, such as reformulations, insertions and corrections made by the teacher, while the recordings are transcribed and aligned with EXMARaLDA. The TEITOK environment enables different views of the same document (XML, student version, corrected version), a CQP-based search interface, the POS, lemmatization and normalization of the tokens, and will soon be used for error annotation in stand-off format. The corpus has already been a source of data for phonological, lexical and syntactic interlanguage studies and will be used for a data-informed selection of language features for each proficiency level.

    Towards error annotation in a learner corpus of Portuguese

    In this article, we present COPLE2, a new corpus of Portuguese that encompasses written and spoken data produced by learners of Portuguese as a foreign or second language (FL/L2). Following the trend towards learner corpus research applied to less commonly taught languages, it is our aim to enhance the learning data of Portuguese L2. These data may be useful not only for educational purposes (design of learning materials, curricula, etc.) but also for the development of NLP tools to support students in their learning process. The corpus is available online using the TEITOK environment, a web-based framework for corpus treatment that provides several built-in NLP tools and a rich set of functionalities (multiple orthographic transcription layers, lemmatization and POS, normalization of the tokens, error annotation) to automatically process and annotate texts in XML format. A CQP-based search interface allows searching the corpus for different fields, such as words, lemmas, POS tags or error tags. We describe the work in progress regarding the constitution and linguistic annotation of this corpus, particularly focusing on error annotation.

    Coordinating in dialogue: Using compound contributions to join a party

    Compound contributions (CCs), dialogue contributions that continue or complete an earlier contribution, are an important and common device conversational participants use to extend their own and each other’s turns. The organisation of these cross-turn structures is one of the defining characteristics of natural dialogue, and cross-person CCs provide the paradigm case of coordination in dialogue. This thesis combines corpus analysis, experiments and theoretical modelling to explore how CCs are used, their effects on coordination and implications for dialogue models. The syntactic and pragmatic distribution of CCs is mapped using corpora of ordinary and task-oriented dialogues. This indicates that the principal factors conditioning the distribution of CCs are pragmatic and that same- and cross-person CCs tend to occur in different contexts. In order to test the impact of CCs on other conversational participants, two experiments are presented. These systematically manipulate, for the first time, the occurrence of CCs in live dialogue using text-based communication. The results suggest that syntax does not directly constrain the interpretation of CCs, and the primary effect of a cross-person CC on third parties is to suggest to them that a strong form of coordination or coalition has formed between the people producing the two parts of the CC. A third experiment explores the conditions under which people will produce a completion for a truncated turn. Manipulations of the structural and contextual predictability of the truncated turn show that while syntax provides a resource for the construction of a CC it does not place significant constraints on where the split point may occur. It also shows that people are more likely to produce continuations when they share common ground. An analysis using the Dynamic Syntax framework is proposed, which extends previous work to account for these findings, and limitations and further research possibilities are outlined.

    Ambiguity Resolution in Spoken Language Understanding

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2022. 8. Nam Soo Kim.
    Ambiguity in language is inevitable. This is because, although language is a means of communication, a concept that a person has in mind cannot be conveyed to everyone in a perfectly identical manner. Inevitable as it is, ambiguity in language understanding often leads to breakdown or failure of communication. Language ambiguity exists at various levels. However, not all ambiguity needs to be resolved. Different aspects of ambiguity exist for each domain and task, and it is crucial to identify the ambiguity that can be well defined and resolved, and then to draw the boundary between the ambiguous parts. In this dissertation, we investigate the types of ambiguity that appear in spoken language processing, especially in intention understanding, and conduct research to define and resolve them. Although this phenomenon occurs in various languages, its degree and aspect depend on the language investigated. We focus on cases where the ambiguity comes from the gap between the amount of information in spoken language and in text. Here, we study the Korean language, in which sentence form and intention often differ depending on the prosody. In Korean, a text can often be read with multiple intentions due to multi-functional sentence enders, frequent pro-drop, wh-intervention, etc. Given that such utterances can be problematic for intention understanding, we first define this type of ambiguity and construct a corpus that helps detect ambiguous sentences. In constructing a corpus for intention understanding, we consider the directivity and rhetoricalness of a sentence. These make up a criterion for classifying the intention of spoken language into statement, question, command, rhetorical question, and rhetorical command. Using a corpus of transcribed spoken language annotated with sufficiently high agreement (kappa = 0.85), we show that colloquial corpus-based language models are effective in classifying ambiguous text given only textual data, and we qualitatively analyze the characteristics of the task. We do not handle ambiguity only at the text level. To find out whether actual disambiguation is possible given a speech input, we design an artificial spoken language corpus composed only of utterances whose text is ambiguous, and resolve the ambiguity with various attention-based neural network architectures. In this process, we observe that ambiguity resolution is most effective when the textual and acoustic inputs co-attend to each other's features, especially when the audio processing module conveys attention information to the text module in a multi-hop manner. Finally, assuming that the ambiguity in intention understanding is resolved by the proposed strategies, we present a brief roadmap of how the results can be utilized at the industry or research level. By integrating text-based ambiguity detection with a speech-based intention understanding module, we can build a system that handles ambiguity efficiently while reducing error propagation. Such a system can be integrated with dialogue managers to make up a task-oriented dialogue system capable of chit-chat, or it can be used for error reduction in multilingual circumstances such as speech translation, beyond merely monolingual conditions. Throughout the dissertation, we want to show that ambiguity resolution for intention understanding in a prosody-sensitive language can be achieved and can be utilized at the industry or research level. We hope that this study helps tackle chronic ambiguity issues in other languages or other domains, linking linguistic science and engineering approaches.
    1 Introduction
      1.1 Motivation
      1.2 Research Goal
      1.3 Outline of the Dissertation
    2 Related Work
      2.1 Spoken Language Understanding
      2.2 Speech Act and Intention
        2.2.1 Performatives and statements
        2.2.2 Illocutionary act and speech act
        2.2.3 Formal semantic approaches
      2.3 Ambiguity of Intention Understanding in Korean
        2.3.1 Ambiguities in language
        2.3.2 Speech act and intention understanding in Korean
    3 Ambiguity in Intention Understanding of Spoken Language
      3.1 Intention Understanding and Ambiguity
      3.2 Annotation Protocol
        3.2.1 Fragments
        3.2.2 Clear-cut cases
        3.2.3 Intonation-dependent utterances
      3.3 Data Construction
        3.3.1 Source scripts
        3.3.2 Agreement
        3.3.3 Augmentation
        3.3.4 Train split
      3.4 Experiments and Results
        3.4.1 Models
        3.4.2 Implementation
        3.4.3 Results
      3.5 Findings and Summary
        3.5.1 Findings
        3.5.2 Summary
    4 Disambiguation of Speech Intention
      4.1 Ambiguity Resolution
        4.1.1 Prosody and syntax
        4.1.2 Disambiguation with prosody
        4.1.3 Approaches in SLU
      4.2 Dataset Construction
        4.2.1 Script generation
        4.2.2 Label tagging
        4.2.3 Recording
      4.3 Experiments and Results
        4.3.1 Models
        4.3.2 Results
      4.4 Summary
    5 System Integration and Application
      5.1 System Integration for Intention Identification
        5.1.1 Proof of concept
        5.1.2 Preliminary study
      5.2 Application to Spoken Dialogue System
        5.2.1 What is 'Free-running'
        5.2.2 Omakase chatbot
      5.3 Beyond Monolingual Approaches
        5.3.1 Spoken language translation
        5.3.2 Dataset
        5.3.3 Analysis
        5.3.4 Discussion
      5.4 Summary
    6 Conclusion and Future Work
    Bibliography
    Abstract (In Korean)
    Acknowledgment

    Defective connective constructions: Some cases in Catalan and Spanish

    Connectives typically relate two content units. However, corpus analysis shows several variants of the general connective construction (i.e., 'S1 Cn S2') in which either segment 1 (S1) or segment 2 (S2) is optional or missing. The aim of this paper is to shed some light on the description of some variants of the connective construction where the connective is not followed by any explicit S2 or S2 is optional. These constructions are complete utterances but they can be considered defective constructions, since one of the slots of the prototypical construction does not include any linguistic material. The analysis focuses on corpus examples including a refutation marker where S2 is implicit, a case that is especially productive and varied in Catalan and in Spanish. Three defective constructions are identified, namely, (i) truncated constructions, (ii) embedded uses of a connective and (iii) reactive constructions. The data show that these defective connective constructions differ in syntax, prosody, semantics and pragmatics. In monologic contexts, when the second segment is missing in the syntactic and prosodic unit considered, the connective is syntactically and prosodically related to S1. The connective can be located at the right periphery of S1 (truncated construction) or in the middle field of S1 (embedded use of a connective). In dialogic contexts, the connective can act as a response to a previous turn and S2 can be either present or absent (reactive constructions). The different configurations match different intonation contours and pause patterns. In all cases, the connective weakens its connective function and adds a modal load, related to (inter)subjectification and intensification. This can be represented as a cline from discourse marking to modal marking.