Text Segmentation Similarity Revisited: A Flexible Distance-based Approach for Multiple Boundary Types

Authors: Lai Ryan Ka Yau; Li Yujie; Zhang Shujie
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/06/2023

Segmentation of texts into discourse and prosodic units is a ubiquitous problem in corpus linguistics and psycholinguistics, yet best practices for its evaluation – whether evaluating consistency between human segmenters or humanlikeness of machine segmenters – remain understudied. Building on segmentation edit distance (Fournier & Inkpen 2012, Fournier 2013), this paper introduces a new measure for evaluating similarity between two segmentations of the same text with multiple, mutually exclusive boundary types, accounting for varying identifiability and confusability between these types. We implement a dynamic programming algorithm for calculation specifically geared towards this type of segmentation problem, apply it to a case study of intonation unit segmentation measuring inter-annotator agreement, and make suggestions for interpreting results.

Processing Units in Conversation: A Comparative Study of French and Mandarin Data

Publication venue: SAGE Publications

Text segmentation with topic modeling and entity coherence

Authors: Boella Guido; Di Caro Luigi; John Adebayo Kolawole
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Final Vowel Devoicing in Blackfoot

Author: Prins Samantha Leigh
Publication venue: University of Montana, Maureen and Mike Mansfield Library
Publication date: 01/01/2019

This thesis presents a study of final vowel devoicing in Blackfoot, an indigenous language of Montana and Alberta. Previous research on final vowel devoicing in Blackfoot variously suggests word-final, phrase-final, and utterance-final vowel devoicing processes (e.g. Taylor 1965, Bliss & Gick 2009, Frantz 2017), though the conditioning environment for this phenomenon had not been a research focus prior to this study. The present study investigates intonation units (IUs) as the conditioning domain for final vowel devoicing in Blackfoot. Final vowel devoicing is investigated here by examining the common word-final suffixes –wa (3SG.AN) and –yi (4SG) in two recordings of connected speech. Each recording features a different native speaker of Blackfoot. Speakers were asked to generate a narrative to go along with illustrations in a picture book. These recordings are interlinearized using ELAN annotation software. Next, tokens of –wa and –yi are analyzed acoustically using Praat phonetic software. Then, –wa and –yi tokens are analyzed in terms of their position within the intonation unit (IU-medial or IU-final). Finally, the data are collated, giving the frequencies of different phonetic variants as well as the distribution of phonetic variants across IU-medial and IU-final environments. The findings of this study are that fully audible variants of –wa and –yi almost always occur IU-medially, while devoiced variants are most frequently found in IU-final position. Based on these findings, this thesis proposes an IU-final vowel devoicing rule to describe the phonetic variation and distribution of –wa and –yi in connected speech. The analysis put forth in this thesis has implications for the theoretical classification of vowel devoicing phenomena, for linguistic research methodologies, and for the typology of intonation units cross-linguistically. Furthermore, the findings of this work bear on language documentation, revitalization, and pedagogy.

Exploring the influence of suprasegmental features of speech on rater judgements of intelligibility

Author: Rogers Thomas Michael
Publication venue: University of Bedfordshire
Publication date: 01/01/2018

A thesis submitted to the University of Bedfordshire in partial fulfilment of the requirements for the degree of Doctor of Philosophy. The importance of suprasegmental features of speech to pronunciation proficiency is well known, yet limited research has been undertaken to identify how raters attend to suprasegmental features in the English-language speaking test encounter. Currently, such features appear to be underrepresented in language learning frameworks and are not always satisfactorily incorporated into the analytical rating scales that are used by major language testing organisations. This thesis explores the influence of lexical stress, rhythm and intonation on rater decision making in order to provide insight into their proper place in rating scales and frameworks. Data were collected from 30 raters, half of whom were experienced professional raters and half of whom lacked rater training and a background in language learning or teaching. The raters were initially asked to score 12 test taker performances using a 9-point intelligibility scale. The performances were taken from the long turn of Cambridge English Main Suite exams and were selected on the basis of the inclusion of a range of notable suprasegmental features. Following scoring, the raters took part in a stimulated recall procedure to report the features that influenced their decisions. The resulting scores were quantitatively analysed using many-facet Rasch measurement analysis. Transcriptions of the verbal reports were analysed using qualitative methods. Finally, an integrated analysis of the quantitative and qualitative data was undertaken to develop a series of suprasegmental rating scale descriptors. The results showed that experienced raters do appear to attend to specific suprasegmental features in a reliable way, and that their decisions have a great deal in common with the way non-experienced raters regard such features. This indicates that stress, rhythm, and intonation may be somewhat underrepresented on current speaking proficiency scales and frameworks. The study concludes with the presentation of a series of suprasegmental rating scale descriptors.

Directional adposition use in English, Swedish and Finnish

Authors: van der Zee Emile; Walker Crystal
Publication venue: International Cognitive Linguistics Association
Publication date: 21/06/2010

Directional adpositions such as to the left of describe where a Figure is in relation to a Ground. English and Swedish directional adpositions refer to the location of a Figure in relation to a Ground, whether both are static or in motion. In contrast, the Finnish directional adpositions edellä (in front of) and jäljessä (behind) solely describe the location of a moving Figure in relation to a moving Ground (Nikanne, 2003). Interpreting the meaning of a directional adposition requires assuming a frame of reference. For example, the meaning of to the left of in English can be based on a relative (speaker or listener based) reference frame or an intrinsic (object based) reference frame (Levinson, 1996). When a Figure and a Ground are both in motion, it is possible for a Figure to be described as being behind or in front of the Ground, even if neither has intrinsic features. As shown by Walker (in preparation), there are good reasons to assume that in the latter case a motion based reference frame is involved. This means that if Finnish speakers used edellä (in front of) and jäljessä (behind) more frequently in situations where both the Figure and Ground are in motion, a difference in reference frame use between Finnish on one hand and English and Swedish on the other could be expected. We asked native English, Swedish and Finnish speakers to select adpositions from a language specific list to describe the location of a Figure relative to a Ground when both were shown to be moving on a computer screen. We were interested in any differences between Finnish, English and Swedish speakers. All languages showed a predominant use of directional spatial adpositions referring to the lexical concepts TO THE LEFT OF, TO THE RIGHT OF, ABOVE and BELOW. There were no differences between the languages in directional adposition use or reference frame use, including reference frame use based on motion. We conclude that despite differences in the grammars of the languages involved, and potential differences in reference frame system use, the three languages investigated encode Figure location in relation to Ground location in a similar way when both are in motion. Levinson, S. C. (1996). Frames of reference and Molyneux’s question: Crosslinguistic evidence. In P. Bloom, M.A. Peterson, L. Nadel & M.F. Garrett (Eds.), Language and Space (pp. 109-170). Massachusetts: MIT Press. Nikanne, U. (2003). How Finnish postpositions see the axis system. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space. Oxford, UK: Oxford University Press. Walker, C. (in preparation). Motion encoding in language, the use of spatial locatives in a motion context. Unpublished doctoral dissertation, University of Lincoln, Lincoln. United Kingdo

Extracting Information from Spoken User Input: A Machine Learning Approach

Author: Lendvai P.K.
Publication venue: [n.n.]
Publication date: 01/01/2004

We propose a module that performs automatic analysis of user input in spoken dialogue systems using machine learning algorithms. The input to the module is material received from the speech recogniser and the dialogue manager of the spoken dialogue system; the output is a four-level pragmatic-semantic representation of the user utterance. Our investigation shows that when the four interpretation levels are combined in a complex machine learning task, the performance of the module is significantly better than the score of an informed baseline strategy. However, via a systematic, automatised search for the optimal subtask combinations we can gain substantial improvement produced by both classifiers for all four interpretation subtasks. A case study is conducted on dialogues between an automatised, experimental system that gives information on the phone about train connections in the Netherlands, and its users who speak in Dutch. We find that by drawing on unsophisticated, potentially noisy features that characterise the dialogue situation, and by performing automatic optimisation of the formulated machine learning task, it is possible to extract sophisticated information of practical pragmatic-semantic value from spoken user input with robust performance. This means that our module can with a good score interpret whether the user of the system is giving slot-filling information, and for which query slots (e.g., departure station, departure time, etc.), whether the user gave a positive or a negative answer to the system, or whether the user signals that there are problems in the interaction.

Tilburg University Repository