578 research outputs found

    Children's Sensitivity to Pitch Variation in Language

    Children acquire consonant and vowel categories by 12 months, but take much longer to learn to interpret perceptible variation. This dissertation considers children’s interpretation of pitch variation. Pitch operates, often simultaneously, at different levels of linguistic structure. English-learning children must disregard pitch at the lexical level (since English is not a tone language) while still attending to pitch for its other functions. Chapters 1 and 5 outline the learning problem and suggest ways children might solve it. Chapter 2 demonstrates that 2.5-year-olds know pitch cannot differentiate words in English. Chapter 3 finds that not until age 4–5 do children correctly interpret pitch cues to emotions. Chapter 4 demonstrates some sensitivity between 2.5 and 5 years to the pitch cue to lexical stress, but continuing difficulties at the older ages. These findings suggest a late trajectory for interpretation of prosodic variation; throughout, I propose explanations for this protracted time-course.

    RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars

    Synthesizing high-fidelity head avatars is a central problem for computer vision and graphics. While head avatar synthesis algorithms have advanced rapidly, the best ones still face great obstacles in real-world scenarios. One of the vital causes is inadequate datasets: 1) current public datasets can only support researchers in exploring high-fidelity head avatars in one or two task directions; 2) these datasets usually contain digital head assets with limited data volume and a narrow distribution over different attributes. In this paper, we present RenderMe-360, a comprehensive 4D human head dataset to drive advances in head avatar research. It contains massive data assets, with 243+ million complete head frames and over 800k video sequences from 500 different identities captured by synchronized multi-view cameras at 30 FPS. It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees. 2) High Diversity: the collected subjects vary across ages, eras, ethnicities, and cultures, providing abundant materials with distinctive styles in appearance and geometry. Moreover, each subject is asked to perform various motions, such as expressions and head rotations, which further extend the richness of assets. 3) Rich Annotations: we provide annotations with different granularities: camera parameters, matting, scan, 2D/3D facial landmarks, FLAME fitting, and text description. Based on the dataset, we build a comprehensive benchmark for head avatar research, with 16 state-of-the-art methods evaluated on five main tasks: novel view synthesis, novel expression synthesis, hair rendering, hair editing, and talking head generation. Our experiments uncover the strengths and weaknesses of current methods. RenderMe-360 opens the door for future exploration in head avatars. Comment: Technical Report; Project Page: 36; Github Link: https://github.com/RenderMe-360/RenderMe-36

    Methodology of Augmented Reality Chinese Language Articulatory Pronunciation Practice: Game and Study Design

    Learning a language can be hard. Learning a language that uses tones to convey meaning is even harder. This dissertation presents a novel methodology for creating a language practice using augmented reality that has never been used before. A new app, designed in AR and non-AR versions, allows the same practice methodology to be evaluated in both settings. This methodology was applied to new software and was examined in regard to the importance of this software. Although the study results are inconclusive, progress has been made in answering research questions on the effectiveness of AR versus non-AR and the reliability of peer assessment. This study is essential for developing future language applications using design and methodologies in AR and peer evaluation.

    Making Foreign Language Education Accessible Through Spanish Animation

    The primary goal of this thesis is to design a second language acquisition Spanish YouTube series for young, English-speaking students in grades K-3, as well as to complete the animation of its pilot episode as a prototype. In order to achieve the final creative product, the preparation was threefold: I researched pedagogical strategies pertaining to language acquisition and the target age group, analyzed current early childhood foreign language resources, and demonstrated the importance of and need for accessible foreign language resources for young students through a review of academic studies. The series is titled ¡Moxie!, and it focuses on the daily adventures of a small puppy named Moxie who only speaks Spanish. ¡Moxie!’s goal is Spanish language acquisition for monolingual English speakers, so its structure aligns with pedagogical principles like Stephen Krashen’s (Krashen) input hypothesis and other target language storytelling guides. However, this is a long-term goal, as it is important to recognize that the pilot episode is only a glimpse of what the entire series could achieve. Nevertheless, this journey of independent work and knowledge creation serves as a model for what an early childhood language acquisition YouTube resource could look like, as well as giving students the opportunity to explore the world of foreign language outside of the classroom.

    Investigating spoken emotion: the interplay of language and facial expression

    This thesis aims to investigate how spoken expressions of emotions are influenced by the characteristics of spoken language and by facial emotion expression. The first three chapters examined how production and perception of emotions differed between Cantonese (tone language) and English (non-tone language). The rationale for this contrast was that the acoustic property of Fundamental Frequency (F0) may be used differently in the production and perception of spoken expressions in tone languages, as F0 may be preserved as a linguistic resource for the production of lexical tones. To test this idea, I first developed the Cantonese Audio-visual Emotional Speech (CAVES) database, which was then used as stimuli in all the studies presented in this thesis (Chapter 1). An emotion perception study was then conducted to examine how three groups of participants (Australian English, Malaysian Malay and Hong Kong Cantonese speakers) identified spoken expressions of emotions that were produced in either English or Cantonese (Chapter 2). As one of the aims of this study was to disambiguate the effects of language from culture, these participants were selected on the basis that they either shared similarities in language type (non-tone language, Malay and English) or culture (collectivist culture, Cantonese and Malay). The results showed that a greater similarity in emotion perception was observed between those who spoke a similar type of language, as opposed to those who shared a similar culture. This suggests some intergroup differences in emotion perception may be attributable to cross-language differences. Following up on these findings, an acoustic analysis study (Chapter 3) showed that, compared to English spoken expressions of emotions, Cantonese expressions had fewer F0-related cues (median and flatter F0 contour) and that the use of F0 cues was also different. Taken together, these results show that language characteristics (in F0 usage) interact with the production and perception of spoken expressions of emotions. The expression of disgust was used to investigate how facial expressions of emotions affect speech articulation. The rationale for selecting disgust was that the facial expression of disgust involves changes to the mouth region such as closure and retraction of the lips, and these changes are likely to have an impact on speech articulation. To test this idea, an automatic lip segmentation and measurement algorithm was developed to quantify the configuration of the lips from images (Chapter 5). By comparing neutral to disgust expressive speech, the results showed that disgust expressive speech is produced with significantly smaller vertical mouth opening, greater horizontal mouth opening and lower first and second formant frequencies (F1 and F2). Overall, this thesis provides an insight into how aspects of expressive speech may be shaped by specific (language type) and universal (face emotion expression) factors.

    Asian American: a personal exploration of my identities and some possible implications for teachers

    As the population of Asian Americans in the United States grows fast, so does the incidence of racist attacks on Asian Americans. The urgency for anti-racist educators to commit to learning how to best serve Asian American children, their families, and their communities in accordance with anti-racist, counter-hegemonic linguistic practices and culturally sustaining principles grows exponentially. Through a deep reflection on my personal and often painful experience as a Korean immigrant in the United States, I use an interdisciplinary approach including Socio- and Racio-linguistics, Social Psychology, Anthropology, and Culturally Sustaining Pedagogy to analyze some of the challenges that I have experienced and observed throughout my life here as a student, teacher and permanent resident. My focus is primarily on three groups of Asian Americans from North Eastern Asia: China, Japan, and Korea. Included are some suggestions for teachers who want to learn more about recognizing, understanding, and being responsive to the myriad strengths that their Asian American students, families and communities bring. I conclude with an afterword that recent attacks on Asian Americans related to the COVID-19 crisis emboldened me to write.

    Multi-language education for indigenous children in Taiwan

    Uncovering the myth of learning to read Chinese characters: phonetic, semantic, and orthographic strategies used by Chinese as foreign language learners

    Oral Session - 6A: Lexical modeling: no. 6A.3. Chinese is considered to be one of the most challenging orthographies for non-native speakers to learn, in particular the character. The Chinese character is the basic reading unit, in which sound, form and meaning converge. The predominant type of Chinese character is the semantic-phonetic compound, which is composed of phonetic and semantic radicals that give clues to the sound and meaning, respectively. Over the last two decades, psycholinguistic research has made significant progress in specifying the roles of phonetic and semantic radicals in character processing among native Chinese speakers …