39,334 research outputs found

    Adapting an ASR Foundation Model for Spoken Language Assessment

    A crucial part of an accurate and reliable spoken language assessment system is the underlying ASR model. Recently, large-scale pre-trained ASR foundation models such as Whisper have been made available. As the output of these models is designed to be human readable, punctuation is added, numbers are presented in Arabic numeric form and abbreviations are included. Additionally, these models have a tendency to skip disfluencies and hesitations in the output. Though useful for readability, these attributes are not helpful for assessing the ability of a candidate and providing feedback. Here a precise transcription of what a candidate said is needed. In this paper, we give a detailed analysis of Whisper outputs and propose two solutions: fine-tuning and soft prompt tuning. Experiments are conducted on both public speech corpora and an English learner dataset. Results show that we can effectively alter the decoding behaviour of Whisper to generate the exact words spoken in the response. Comment: Proceedings of SLaT

    Analyzing the Targets of Hate in Online Social Media

    Social media systems allow Internet users a congenial platform to freely express their thoughts and opinions. Although this property represents incredible and unique communication opportunities, it also brings along important challenges. Online hate speech is an archetypal example of such challenges. Despite its magnitude and scale, there is a significant gap in understanding the nature of hate speech on social media. In this paper, we provide the first-of-its-kind systematic large-scale measurement study of the main targets of hate speech in online social media. To do that, we gather traces from two social media systems: Whisper and Twitter. We then develop and validate a methodology to identify hate speech on both these systems. Our results identify online hate speech forms and offer a broader understanding of the phenomenon, providing directions for prevention and detection approaches. Comment: Short paper, 4 pages, 4 tables

    Color and texture associations in voice-induced synesthesia

    Voice-induced synesthesia, a form of synesthesia in which synesthetic perceptions are induced by the sounds of people's voices, appears to be relatively rare and has not been systematically studied. In this study we investigated the synesthetic color and visual texture perceptions experienced in response to different types of “voice quality” (e.g., nasal, whisper, falsetto). Experiences of three different groups (self-reported voice synesthetes, phoneticians, and controls) were compared using both qualitative and quantitative analysis in a study conducted online. Whilst, in the qualitative analysis, synesthetes used more color and texture terms to describe voices than either phoneticians or controls, only weak differences, and many similarities, between groups were found in the quantitative analysis. Notable consistent results between groups were the matching of higher speech fundamental frequencies with lighter and redder colors, the matching of “whispery” voices with smoke-like textures, and the matching of “harsh” and “creaky” voices with textures resembling dry cracked soil. These data are discussed in the light of current thinking about definitions and categorizations of synesthesia, especially in cases where individuals apparently have a range of different synesthetic inducers.

    Faculty concert: Penelope Bitzas, Shiela Kibbe, and Eric Ruske, January 22, 1999

    This is the concert program of the Faculty Concert performance by Penelope Bitzas, Shiela Kibbe, and Eric Ruske on Friday, January 22, 1999 at 8:00 p.m., at the Tsai Performance Center, 685 Commonwealth Avenue, Boston, Massachusetts. Works performed were E ingrato lo veggio, from Adriano in Siria by Giovanni Pergolesi; Confusa, smarrita, spiegarti, from Cantone by Baldassare Galuppi; Songs and Dances of Death by Modeste Mussorgsky; Gondoliera, Nimmer denkst du mein, and Der Traum der ersten Liebe by Heinrich Esser; Tonadillas by Enrique Granados; Lass from the Low Countree by John Jacob Niles; West London by Charles Ives; Why Don't You? by Lee Hoiby; and The Frog and the Snake by Irving Fine. Digitization for Boston University Concert Programs was supported by the Boston University Humanities Library Endowed Fund.

    [Review of] Silvester Brito. Red Cedar Warrior

    Red Cedar Warrior, the collection of poems by S.J. Brito, is very obvious in its depiction of trepidations against Native Americans, in its mourning for the loss of culture and traditions, and its expression of anger. We easily see the obvious signs of Native Americanism in most of the poems included in his book. The warrior could not be anything other than Native American, astride a pony, feathered and painted. There are the drums, the ceremonial life, the peyote prayers, the shamans, and such references. We easily see the images and hear the voices that most let us know of the poet's intent to share with us a Native American viewpoint. And why not? After all, Brito is a proud descendant of Comanches and Tarascans.

    Towards Generalizable SER: Soft Labeling and Data Augmentation for Modeling Temporal Emotion Shifts in Large-Scale Multilingual Speech

    Recognizing emotions in spoken communication is crucial for advanced human-machine interaction. Current emotion detection methodologies often display biases when applied cross-corpus. To address this, our study amalgamates 16 diverse datasets, resulting in 375 hours of data across languages like English, Chinese, and Japanese. We propose a soft labeling system to capture gradational emotional intensities. Using the Whisper encoder and data augmentation methods inspired by contrastive learning, our method emphasizes the temporal dynamics of emotions. Our validation on four multilingual datasets demonstrates notable zero-shot generalization. We publish our open source model weights and initial promising results after fine-tuning on Hume-Prosody. Comment: Accepted as talk at NeurIPS ML for Audio workshop

    When the Bloom is on the Cotton Dixie Lee

    VERSE 1
    Round my southern home the cotton fields are blooming,
    Far away the river glistens ‘neath the moon;
    Through the twilight comes the breath of clover blossoms,
    Far away the darkies sing a southern tune.
    Side by side down by the flowing stream we wandered,
    On your face the moonlight cast a golden glow;
    And I kissed away the tears when you were crying
    As I said “Goodbye” and whispered soft and low.

    REFRAIN
    When the bloom is on the cotton, Dixie Lee,
    Life’s sun will shine again for you and me;
    I’ll return to you once more, we’ll be happy as of yore
    When the bloom is on the cotton, Dixie Lee.

    VERSE 2
    In my dreams tonight I’m roaming with you Dixie,
    While the cotton fields are all abloom once more;
    I can see the soft moonlight upon the river,
    And your sweet face as you stroll along the shore.
    But we nevermore will wander down the pathway,
    As we did the night you gave your heart to me,
    For the tolling bells they tell the sad, and story
    While the breezes whisper “Farewell” Dixie Lee.

    REFRAIN