
    Phonetic variability and grammatical knowledge: an articulatory study of Korean place assimilation.

    The study reported here uses articulatory data to investigate Korean place assimilation of coronal stops followed by labial or velar stops, both within words and across words. The results show that this place-assimilation process is highly variable, both within and across speakers, and is also sensitive to factors such as the place of articulation of the following consonant, the presence of a word boundary and, to some extent, speech rate. Gestures affected by the process are generally reduced categorically (deleted), while sporadic gradient reduction of gestures is also observed. We further compare the results for coronals to our previous findings on the assimilation of labials, discussing implications of the results for grammatical models of phonological/phonetic competence. The results suggest that speakers’ language-particular knowledge of place assimilation has to be relatively detailed and context-sensitive, and has to encode systematic regularities about its obligatory/variable application as well as categorical/gradient realisation.

    Speech Communication

    Contains research objectives, summary of research, and reports on three research projects. Supported by: U. S. Navy - Office of Naval Research (Contract N00014-67-A-0204-0064); U. S. Navy - Office of Naval Research (Contract N00014-67-A-0204-0069); National Science Foundation (Grant GK-31353); National Institutes of Health (Grant 5 RO1 NS04332-10); Joint Services Electronics Programs (U. S. Army, U. S. Navy, and U. S. Air Force) under Contract DAAB07-71-C-0300; Bell Telephone Laboratories Fellowship.

    Communication with diminutives to young children vs. pets in German, Italian, Lithuanian, Russian, and English

    This contribution is dedicated to Steven Gillis, with whom we have collaborated since the nineties within the “Crosslinguistic Project on Pre- and Protomorphology in Language Acquisition” on both child speech (CS) and child-directed speech (CDS), and also on the development of diminutives (DIMs). We investigate parallels in the use of DIMs and of hypocoristics (HYPs) between CDS and pet-directed speech (PDS), whereas CS is only marginally dealt with. When relevant, adult-directed speech (ADS), written or oral (especially from electronic corpora, wherever available), will also be compared. The presuppositions of this investigation will be stated at the beginning of the Introduction (§ 1). The investigation involves several innovations (beyond descriptions of new data) compared with the existing literature, relevant to theoretical and typological problem areas. We will show that, in the DIMs and HYPs used in CDS and PDS as well, semantics plays only a partial or even marginal role in the use of more DIMs to communicate with young children and young and/or small pets, because the more relevant factor is that young children and young and/or small pets are emotionally closer to us, which is again a pragmatic factor. With regard to language typology, we will apply our concepts of morphological richness and productivity, as argued for and supported in our previous publications, to CDS and PDS, and show that richer and more productive patterns of DIM formation in a language also have a typological impact, in the form of more frequent and more productive use in both CDS and PDS. We will also apply our concepts of grading morphosemantic transparency/opacity, as argued for and supported in our previous publications, and we start to show, as already shown for CS, that in CDS towards young children (and similarly in PDS) more morphosemantically transparent DIMs are used than in ADS. This is also connected to their predominantly pragmatic meanings in CDS and PDS (obviously not exclusively pragmatic, as in early CS). The languages and authors were selected according to which participants in the Crosslinguistic Project on Pre- and Protomorphology in Language Acquisition had CDS and PDS data available, plus Elisa Mattiello, who has collected English and Italian PDS data. This article deals with the use of diminutives and hypocoristics in two language registers: child-directed speech (CDS) and pet-directed speech (PDS). The semantics of diminutives turns out to play a smaller role than pragmatics: the emotional closeness of children and pets. The study, which compares five languages, also explores typology: the morphological richness of a language’s diminutives influences their production. In addition, the semantic transparency of diminutives plays a role cross-linguistically: in CDS and PDS, more transparent diminutives are used.

    Developing an Affect-Aware Rear-Projected Robotic Agent

    Social (or sociable) robots are designed to interact with people in a natural and interpersonal manner. They are becoming an integrated part of our daily lives and have achieved positive outcomes in several applications such as education, health care, quality of life, and entertainment. Despite significant progress towards the development of realistic social robotic agents, a number of problems remain to be solved. First, current social robots either lack the ability to have deep social interaction with humans, or they are very expensive to build and maintain. Second, current social robots have yet to reach the full emotional and social capabilities necessary for rich and robust interaction with human beings. To address these problems, this dissertation presents the development of a low-cost, flexible, affect-aware rear-projected robotic agent (called ExpressionBot) that is designed to support verbal and non-verbal communication between the robot and humans, with the goal of closely modeling the dynamics of natural face-to-face communication. The developed robotic platform uses state-of-the-art character animation technologies to create an animated human face (aka avatar) that is capable of showing facial expressions, realistic eye movement, and accurate visual speech, and then projects this avatar onto a face-shaped translucent mask. The mask and the projector are then rigged onto a neck mechanism that can move like a human head. Since an animation is projected onto a mask, the robotic face is a highly flexible research tool; it is mechanically simple and low-cost to design, build, and maintain compared with mechatronic and android faces. The results of our comprehensive Human-Robot Interaction (HRI) studies illustrate the benefits and value of the proposed rear-projected robotic platform over a virtual agent with the same animation displayed on a 2D computer screen. The results indicate that ExpressionBot is well accepted by users, with some advantages in expressing facial expressions more accurately and perceiving mutual eye-gaze contact. To improve the social capabilities of the robot and create an expressive and empathic (affect-aware) social agent capable of interpreting users' emotional facial expressions, we developed a new Deep Neural Network (DNN) architecture for Facial Expression Recognition (FER). The proposed DNN was initially trained on seven well-known publicly available databases and obtained results significantly better than, or comparable to, traditional convolutional neural networks and other state-of-the-art methods in both accuracy and learning time. Since the performance of an automated FER system depends heavily on its training data, and the eventual goal of the proposed robotic platform is to interact with users in an uncontrolled environment, a database of facial expressions in the wild (called AffectNet) was created by querying emotion-related keywords from different search engines. AffectNet contains more than 1M images with faces and 440,000 manually annotated images with facial expressions, valence, and arousal. Two DNNs were trained on AffectNet to classify the facial expression images and to predict the values of valence and arousal. Various evaluation metrics show that our deep neural network approaches trained on AffectNet perform better than conventional machine learning methods and available off-the-shelf FER systems.
We then integrated this automated FER system into the spoken dialog of our robotic platform to extend and enrich the capabilities of ExpressionBot beyond spoken dialog and create an affect-aware robotic agent that can measure and infer users' affect and cognition. Three social/interaction aspects (task engagement, being empathic, and likability of the robot) were measured in an experiment with the affect-aware robotic agent. The results indicate that users rated our affect-aware agent as empathic and likable as a robot in which the user's affect is recognized by a human (Wizard-of-Oz, WoZ). In summary, this dissertation presents the development and HRI studies of a perceptive, expressive, conversational, rear-projected, life-like robotic agent (aka ExpressionBot or Ryan) that models natural face-to-face communication between a human and an empathic agent. The results of our in-depth human-robot interaction studies show that this robotic agent can serve as a model for creating the next generation of empathic social robots.
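
A brief illustrative sketch may help make the FER setup concrete. The dissertation trained two separate DNNs on AffectNet (one for categorical expressions, one for valence/arousal); the PyTorch snippet below is not that architecture but a minimal stand-in that combines both tasks in one small network, with layer sizes, input resolution, and the dummy training step all invented for illustration.

```python
# Minimal sketch (assumed architecture, not the dissertation's): a CNN with a
# classification head for 7 facial expressions and a regression head for
# valence/arousal, as one might train on an AffectNet-style dataset.
import torch
import torch.nn as nn

class ExpressionNet(nn.Module):
    def __init__(self, num_expressions: int = 7):
        super().__init__()
        # Shared convolutional feature extractor (sizes are illustrative).
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.expression_head = nn.Linear(128, num_expressions)  # categorical emotions
        self.va_head = nn.Linear(128, 2)                        # valence, arousal in [-1, 1]

    def forward(self, x):
        h = self.features(x).flatten(1)
        return self.expression_head(h), torch.tanh(self.va_head(h))

# Joint loss: cross-entropy for expression labels, MSE for valence/arousal.
model = ExpressionNet()
images = torch.randn(8, 3, 96, 96)                 # dummy batch of face crops
labels = torch.randint(0, 7, (8,))
va_targets = torch.rand(8, 2) * 2 - 1
logits, va = model(images)
loss = nn.functional.cross_entropy(logits, labels) + nn.functional.mse_loss(va, va_targets)
loss.backward()
```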

    The Development Of Glide Deletion In Seoul Korean: A Corpus And Articulatory Study

    This dissertation investigates the pathways and causes of the development of glide deletion in Seoul Korean. Seoul provides fertile ground for studies of linguistic innovation in an urban setting, since it underwent rapid historical, social and demographic changes in the twentieth century. The phenomenon under investigation is the variable deletion of the labiovelar glide /w/, found to be on the rise in Seoul Korean (Silva, 1991; Kang, 1997). I present two studies addressing variation and change at two different levels: a corpus study tracking the development of /w/-deletion at the phonological level and an articulatory study examining the phonetic aspect of this change. The corpus data are drawn from sociolinguistic interviews conducted with 48 native Seoul Koreans between 2015 and 2017. A trend comparison with the data from an earlier study of /w/-deletion (Kang, 1997) reveals that /w/-deletion in postconsonantal position has begun to retreat, while non-postconsonantal /w/-deletion has been rising vigorously. More importantly, the effect of the preceding segment, which used to be the strongest constraint on /w/-deletion, has weakened over time. I conclude that /w/-deletion in Seoul Korean is being reanalyzed, with the structural details being diluted over time. I analyze this weakening of the original pattern as the result of linguistic diffusion induced by a great influx of migrants into Seoul after the Korean War (1950-1953). In the articulatory study, ultrasound data of tongue movements and video data of lip rounding for the production of /w/ by three native Seoul Koreans in their 20s, 30s and 50s were analyzed using Optical Flow Analysis. I find that /w/ in Seoul Korean is subject to both gradient reduction and categorical deletion, and that younger speakers exhibit significantly larger articulatory gestures for /w/ after a bilabial than older speakers, which is consistent with the pattern of phonological change found in the corpus study. This dissertation demonstrates the importance of using both corpus and articulatory data in the investigation of a change, finding the coexistence of gradient and categorical effects in segmental deletion processes. Finally, it advances our understanding of the outcome of migration-induced dialect contact in contemporary urban settings.
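
The abstract above mentions Optical Flow Analysis of ultrasound and lip video but gives no implementation details. As a hedged illustration only, the sketch below shows one common way to derive a frame-by-frame articulatory movement signal from a video using dense optical flow in OpenCV; the file name and the choice of Farnebäck flow are assumptions, not details from the dissertation.

```python
# Illustrative sketch: quantify frame-to-frame articulator movement in an
# ultrasound (or lip) video with dense optical flow. Input file name is assumed.
import cv2
import numpy as np

def movement_signal(video_path: str) -> np.ndarray:
    """Return mean optical-flow magnitude per frame transition."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise IOError(f"Cannot read {video_path}")
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    magnitudes = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Dense Farnebäck optical flow between consecutive frames.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2)      # per-pixel displacement
        magnitudes.append(float(mag.mean()))    # average movement for this frame
        prev_gray = gray
    cap.release()
    return np.asarray(magnitudes)

# Peaks in this signal can then be aligned with the acoustic record to estimate
# the size of the /w/ gesture in each token (illustrative use only).
```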

    Gradient Activation of Speech Categories Facilitates Listeners’ Recovery From Lexical Garden Paths, But Not Perception of Speech-in-Noise

    Listeners activate speech-sound categories in a gradient way, and this information is maintained and affects activation of items at higher levels of processing (McMurray et al., 2002; Toscano et al., 2010). Recent findings by Kapnoula et al. (2017) suggest that the degree to which listeners maintain within-category information varies across individuals. Here we assessed the consequences of this gradiency for speech perception. To test this, we collected a measure of gradiency for different listeners using the visual analogue scaling (VAS) task used by Kapnoula et al. (2017). We also collected two independent measures of performance in speech perception: a visual world paradigm (VWP) task measuring participants’ ability to recover from lexical garden paths (McMurray et al., 2009) and a speech-perception task measuring participants’ perception of isolated words in noise. Our results show that categorization gradiency does not predict participants’ performance in the speech-in-noise task. However, higher gradiency predicted a higher likelihood of recovery from temporarily misleading information presented in the VWP task. These results suggest that gradient activation of speech-sound categories is helpful when listeners need to reconsider their initial interpretation of the input, making them more efficient in recovering from errors. This project was supported by National Institutes of Health Grant DC008089 awarded to Bob McMurray. This work was partially supported by the Basque Government through the BERC 2018-2021 Program and by the Spanish State Research Agency through BCBL Severo Ochoa excellence accreditation SEV-2015-0490. This project was partially supported by the Spanish Ministry of Economy and Competitiveness (MINECO) through the convocatoria 2016 Subprograma Estatal Ayudas para contratos para la Formación Posdoctoral 2016, Programa Estatal de Promoción del Talento y su Empleabilidad del Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016, reference FJCI-2016-28019, awarded to Efthymia C. Kapnoula. This project has received funding from the European Union’s Horizon 2020 research and innovation program under Marie Skłodowska-Curie Grant 793919, awarded to Efthymia C. Kapnoula.
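
The gradiency measure referred to above comes from a visual analogue scaling task in which listeners rate tokens along a speech continuum. Kapnoula et al.'s exact index is not reproduced here; as a hedged illustration of the general idea, the sketch below fits a logistic function to hypothetical VAS ratings across an 8-step VOT continuum and uses the inverse of the fitted slope as a crude gradiency score, with shallower category boundaries indicating more gradient responding.

```python
# Illustrative only: estimate response gradiency by fitting a logistic curve
# to visual-analogue-scale ratings over a VOT continuum (hypothetical data).
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Logistic function with boundary x0 and slope k, scaled to 0-100."""
    return 100.0 / (1.0 + np.exp(-k * (x - x0)))

steps = np.arange(1, 9)                              # 8-step /b/-/p/ VOT continuum
ratings = np.array([4, 7, 15, 32, 58, 80, 91, 96])   # mean VAS ratings (0-100), invented

(x0, k), _ = curve_fit(logistic, steps, ratings, p0=[4.5, 1.0])
gradiency = 1.0 / k   # shallower slope (small k) -> larger value -> more gradient
print(f"boundary = {x0:.2f} steps, slope = {k:.2f}, gradiency index = {gradiency:.2f}")
```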

    The relation between acoustic and articulatory variation in vowels : data from American and Australian English

    In studies of dialect variation, the articulatory nature of vowels is sometimes inferred from formant values using the following heuristic: F1 is inversely correlated with tongue height and F2 is inversely correlated with tongue backness. This study compared vowel formants and corresponding lingual articulation in two dialects of English, standard North American English and Australian English. Five speakers of North American English and four speakers of Australian English were recorded producing multiple repetitions of ten monophthongs embedded in the /sVd/ context. Simultaneous articulatory data were collected using electromagnetic articulography. Results show that there are significant correlations between tongue position and formants in the direction predicted by the heuristic, but also that the relations implied by the heuristic break down under specific conditions. Articulatory vowel spaces, based on tongue dorsum (TD) position, and acoustic vowel spaces, based on formants, show systematic misalignment, due in part to the influence of other articulatory factors, including lip rounding and tongue curvature, on formant values. Incorporating these dimensions into our dialect comparison yields a richer description and a more robust understanding of how vowel formant patterns are reproduced within and across dialects.
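
The heuristic stated at the start of this abstract is a simple correlational claim, so it can be checked directly once formant and articulatory measurements are paired per token. As a hedged sketch (with made-up arrays standing in for real EMA and formant data), the following code computes the Pearson correlations that the heuristic predicts should be negative.

```python
# Illustrative check of the F1/height and F2/backness heuristic on paired
# per-token measurements (the arrays below are placeholders, not real data).
import numpy as np
from scipy.stats import pearsonr

# One row per vowel token: formants in Hz, tongue-dorsum position in mm.
f1 = np.array([310, 420, 640, 720, 580, 350, 460, 690])
f2 = np.array([2300, 1900, 1700, 1200, 1000, 900, 2100, 1500])
td_height = np.array([14.2, 11.8, 8.5, 7.1, 9.0, 13.5, 11.0, 7.8])    # larger = higher tongue
td_backness = np.array([3.1, 4.0, 4.8, 7.2, 8.1, 8.5, 3.4, 6.0])      # larger = more back

r_height, p_height = pearsonr(f1, td_height)       # heuristic predicts r < 0
r_back, p_back = pearsonr(f2, td_backness)         # heuristic predicts r < 0
print(f"F1 vs tongue height:   r = {r_height:.2f} (p = {p_height:.3f})")
print(f"F2 vs tongue backness: r = {r_back:.2f} (p = {p_back:.3f})")
```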

    Dutch and German 3-year-olds’ representations of voicing alternations

    The voicing contrast is neutralised syllable- and word-finally in Dutch and German, leading to alternations within the morphological paradigm (e.g. Dutch ‘bed(s)’, be[t]-be[d]en; German ‘dog(s)’, Hun[t]-Hun[d]e). Despite this structural similarity, language-specific morphological, phonological and lexical properties impact on the distribution of the alternation in the two languages. Previous acquisition research has examined one language at a time, predominantly focusing on children’s production accuracy, and has concluded that alternations are not acquired until late in the acquisition process in either language. This paper adapts a perceptual method to investigate how voicing alternations are represented in the mental lexicon of Dutch and German 3-year-olds. Sensitivity to mispronunciations of voicing word-medially in plural forms was measured using a visual fixation procedure. Dutch children exhibited evidence of overgeneralising the voicing alternation, whereas German children consistently preferred the correct pronunciation to mispronunciations. Results indicate that the acquisition of voicing alternations is influenced by language-specific factors beyond the alternation itself.
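
The visual fixation measure described above boils down to comparing looking times for correctly pronounced versus mispronounced forms. As a hedged sketch with invented per-child numbers (not data from the study), the code below computes that preference score and a paired t-test across children.

```python
# Illustrative analysis of a mispronunciation-sensitivity measure:
# per-child mean looking times (s) to correct vs. mispronounced plurals.
# The numbers are invented placeholders, not data from the study.
import numpy as np
from scipy.stats import ttest_rel

correct = np.array([7.8, 6.9, 8.4, 7.1, 6.5, 7.9, 8.0, 7.3])
mispronounced = np.array([6.9, 6.2, 7.6, 7.0, 5.9, 7.1, 7.5, 6.6])

effect = correct - mispronounced           # positive = preference for the correct form
t, p = ttest_rel(correct, mispronounced)   # paired t-test across children
print(f"mean preference = {effect.mean():.2f} s, t = {t:.2f}, p = {p:.3f}")
```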

    Predicting while comprehending language: A theory and review

    Researchers agree that comprehenders regularly predict upcoming language, but they do not always agree on what prediction is (and how to differentiate it from integration) or what constitutes evidence for it. After defining prediction, we show that it occurs at all linguistic levels from semantics to form, and then propose a theory of which mechanisms comprehenders use to predict. We argue that they most effectively predict using their production system (i.e., prediction-by-production): They covertly imitate the linguistic form of the speaker’s utterance and construct a representation of the underlying communicative intention. Comprehenders can then run this intention through their own production system to prepare the predicted utterance. But doing so takes time and resources, and comprehenders vary in the extent of preparation, with many groups of comprehenders (non-native speakers, illiterates, children, and older adults) using it less than typical native young adults. We thus argue that prediction-by-production is an optional mechanism, which is augmented by mechanisms based on association. Support for our proposal comes from many areas of research (electrophysiological, eye-tracking, and behavioral studies of reading, spoken language processing in the context of visual environments, speech processing, and dialogue).