578 research outputs found
Children's Sensitivity to Pitch Variation in Language
Children acquire consonant and vowel categories by 12 months, but take much longer to learn to interpret perceptible variation. This dissertation considers children's interpretation of pitch variation. Pitch operates, often simultaneously, at different levels of linguistic structure. English-learning children must disregard pitch at the lexical level—since English is not a tone language—while still attending to pitch for its other functions. Chapters 1 and 5 outline the learning problem and suggest ways children might solve it. Chapter 2 demonstrates that 2.5-year-olds know pitch cannot differentiate words in English. Chapter 3 finds that not until age 4–5 do children correctly interpret pitch cues to emotions. Chapter 4 demonstrates some sensitivity between 2.5 and 5 years to the pitch cue to lexical stress, but continuing difficulties at the older ages. These findings suggest a late trajectory for the interpretation of prosodic variation; throughout, I propose explanations for this protracted time-course.
RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars
Synthesizing high-fidelity head avatars is a central problem for computer vision and graphics. While head avatar synthesis algorithms have advanced rapidly, the best ones still face great obstacles in real-world scenarios. One of the vital causes is inadequate datasets: 1) current public datasets support research on high-fidelity head avatars in only one or two task directions; 2) these datasets usually contain digital head assets with limited data volume and narrow distribution over different attributes. In this paper, we present RenderMe-360, a comprehensive 4D human head dataset to drive advances in head avatar research. It contains massive data assets, with more than 243 million complete head frames and over 800k video sequences from 500 different identities, captured by synchronized multi-view cameras at 30 FPS. It is a large-scale digital library for head avatars with three key attributes: 1) High Fidelity: all subjects are captured by 60 synchronized, high-resolution 2K cameras in 360 degrees. 2) High Diversity: the collected subjects vary in age, era, ethnicity, and culture, providing abundant materials with distinctive styles in appearance and geometry. Moreover, each subject is asked to perform various motions, such as expressions and head rotations, which further extend the richness of the assets. 3) Rich Annotations: we provide annotations at different granularities: camera parameters, matting, scans, 2D/3D facial landmarks, FLAME fitting, and text descriptions. Based on the dataset, we build a comprehensive benchmark for head avatar research, with 16 state-of-the-art methods evaluated on five main tasks: novel view synthesis, novel expression synthesis, hair rendering, hair editing, and talking head generation. Our experiments uncover the strengths and weaknesses of current methods. RenderMe-360 opens the door for future exploration in head avatars.
Comment: Technical Report; Project Page: 36; Github Link: https://github.com/RenderMe-360/RenderMe-36
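The capture rig described above implies a simple frame-count relation worth making explicit: with synchronized cameras, every second of performance yields (number of cameras × FPS) multi-view frames. A minimal sketch of this arithmetic, using the figures from the abstract (the function name and units are my own, not part of the dataset's API):

```python
def frames_captured(num_cameras: int, fps: int, duration_s: float) -> int:
    """Total multi-view frames recorded for one sequence:
    each synchronized camera contributes `fps` frames per second."""
    return int(num_cameras * fps * duration_s)

# With the abstract's rig (60 cameras at 30 FPS), a 10-second
# performance yields 60 * 30 * 10 = 18,000 complete head frames.
total = frames_captured(num_cameras=60, fps=30, duration_s=10)
```

This is why even short recordings of 500 identities accumulate into hundreds of millions of frames.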
Methodology of Augmented Reality Chinese Language Articulatory Pronunciation Practice: Game and Study Design
Learning a language can be hard. Learning a language that uses tones to convey meaning is even harder. This dissertation presents a novel methodology for language pronunciation practice using augmented reality. The design of a new app in AR and non-AR versions allows the same practice methodology to be evaluated in both settings. The methodology was implemented in new software, and the significance of that software was examined. Although the study results are inconclusive, progress has been made in answering research questions on the effectiveness of AR versus non-AR practice and the reliability of peer assessment. This study is essential for developing future language applications that use AR design, methodology, and peer evaluation.
Making Foreign Language Education Accessible Through Spanish Animation
The primary goal of this thesis is to design a second language acquisition Spanish YouTube series for young, English-speaking students in grades K-3, as well as to complete the animation of its pilot episode as a prototype. In order to achieve the final creative product, the preparation was threefold: I researched pedagogical strategies pertaining to language acquisition and the target age group, analyzed current early childhood foreign language resources, and demonstrated the importance and need for accessible, foreign language resources for young students through a review of academic studies.
The series is titled ¡Moxie!, and it focuses on the daily adventures of a small puppy named Moxie who only speaks Spanish. ¡Moxie!'s goal is Spanish language acquisition for monolingual English speakers, so its structure aligns with pedagogical principles like Stephen Krashen's input hypothesis and other target-language storytelling guides. However, this is a long-term goal, as it is important to recognize that the pilot episode is only a glimpse of what the entire series could achieve. Nevertheless, this journey of independent work and knowledge creation serves as a model for what an early childhood language acquisition YouTube resource could look like, and it gives students the opportunity to explore the world of foreign language outside of the classroom.
Investigating spoken emotion : the interplay of language and facial expression
This thesis aims to investigate how spoken expressions of emotions are influenced by the characteristics of spoken language and by facial emotion expression. The first three chapters examined how the production and perception of emotions differed between Cantonese (a tone language) and English (a non-tone language). The rationale for this contrast was that the acoustic property of fundamental frequency (F0) may be used differently in the production and perception of spoken expressions in tone languages, as F0 may be reserved as a linguistic resource for the production of lexical tones. To test this idea, I first developed the Cantonese Audio-visual Emotional Speech (CAVES) database, which was then used as stimuli in all the studies presented in this thesis (Chapter 1). An emotion perception study was then conducted to examine how three groups of participants (Australian English, Malaysian Malay and Hong Kong Cantonese speakers) identified spoken expressions of emotions that were produced in either English or Cantonese (Chapter 2). As one of the aims of this study was to disambiguate the effects of language from those of culture, these participants were selected on the basis that they either shared similarities in language type (non-tone languages: Malay and English) or culture (collectivist cultures: Cantonese and Malay). The results showed greater similarity in emotion perception between those who spoke a similar type of language than between those who shared a similar culture. This suggests some intergroup differences in emotion perception may be attributable to cross-language differences. Following up on these findings, an acoustic analysis study (Chapter 3) showed that, compared to English spoken expressions of emotions, Cantonese expressions showed reduced F0-related cues (a lower median and a flatter F0 contour), and the F0 cues were used differently.
Taken together, these results show that language characteristics (in F0 usage) interact with the production and perception of spoken expressions of emotions. The expression of disgust was used to investigate how facial expressions of emotions affect speech articulation. The rationale for selecting disgust was that the facial expression of disgust involves changes to the mouth region, such as closure and retraction of the lips, and these changes are likely to have an impact on speech articulation. To test this idea, an automatic lip segmentation and measurement algorithm was developed to quantify the configuration of the lips from images (Chapter 5). By comparing neutral to disgust expressive speech, the results showed that disgust expressive speech is produced with a significantly smaller vertical mouth opening, a greater horizontal mouth opening, and lower first and second formant frequencies (F1 and F2). Overall, this thesis provides insight into how aspects of expressive speech may be shaped by specific (language type) and universal (facial emotion expression) factors.
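The lip measurements described in Chapter 5 can be pictured with a minimal sketch: given lip landmarks in image coordinates, the vertical mouth opening is the distance between the upper and lower lip centers, and the horizontal opening is the distance between the mouth corners. The landmark names and (x, y) tuple format below are illustrative assumptions, not the thesis's actual segmentation algorithm, which extracts lip contours automatically from images:

```python
def mouth_opening(landmarks):
    """Compute (vertical, horizontal) mouth opening in pixels from four
    lip landmarks. Landmark keys and (x, y) format are hypothetical."""
    top = landmarks["upper_lip_center"]
    bottom = landmarks["lower_lip_center"]
    left = landmarks["left_corner"]
    right = landmarks["right_corner"]
    vertical = abs(bottom[1] - top[1])      # lip-to-lip distance
    horizontal = abs(right[0] - left[0])    # corner-to-corner distance
    return vertical, horizontal

# Example: a disgust expression would show a smaller vertical and a
# larger horizontal value than the same speaker's neutral expression.
v, h = mouth_opening({
    "upper_lip_center": (50, 40),
    "lower_lip_center": (50, 55),
    "left_corner": (30, 48),
    "right_corner": (72, 48),
})
```

Comparing these two scalars between neutral and expressive frames is enough to capture the "smaller vertical, greater horizontal" pattern the thesis reports.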
Asian American: a personal exploration of my identities and some possible implications for teachers
As the population of Asian Americans in the United States grows rapidly, so does the incidence of racist attacks on Asian Americans. The urgency for anti-racist educators to commit to learning how best to serve Asian American children, their families, and their communities, in accordance with anti-racist, counter-hegemonic linguistic practices and culturally sustaining principles, grows exponentially. Through a deep reflection on my personal and often painful experience as a Korean immigrant in the United States, I use an interdisciplinary approach, including socio- and racio-linguistics, social psychology, anthropology, and culturally sustaining pedagogy, to analyze some of the challenges that I have experienced and observed throughout my life here as a student, teacher, and permanent resident. My focus is primarily on three groups of Asian Americans from Northeast Asia: China, Japan, and Korea. Included are some suggestions for teachers who want to learn more about recognizing, understanding, and being responsive to the myriad strengths that their Asian American students, families, and communities bring. I conclude with an afterword, which the recent attacks on Asian Americans related to the COVID-19 crisis emboldened me to write.
Uncovering the myth of learning to read Chinese characters: phonetic, semantic, and orthographic strategies used by Chinese as foreign language learners
Oral Session - 6A: Lexical modeling: no. 6A.3
Chinese is considered one of the most challenging orthographies for non-native speakers to learn, in particular its characters. The Chinese character is the basic reading unit, converging sound, form, and meaning. The predominant type of Chinese character is the semantic-phonetic compound, composed of a phonetic radical and a semantic radical that give clues to the sound and the meaning, respectively. Over the last two decades, psycholinguistic research has made significant progress in specifying the roles of phonetic and semantic radicals in character processing among native Chinese speakers …