
    The folksong jukebox: Singing along for social change in rural India

    In designing digital literacy content for marginalized demographics, we need to draw on local resources to structure engaging and meaningful media experiences. This paper examines the socio-cognitive implications for learning of a novel edutainment product in rural India, stemming from an e-development initiative funded by Hewlett-Packard. The product combines multiple media forms: text, audio and visuals, with social-awareness folk themes endemic to the locality. It uses the karaoke-style 'same language subtitling' feature, which won the World Bank Development Marketplace Award in 2002 for a simple yet innovative application with a demonstrated impact on reading skills. The product strives to combine cultural regeneration, value-based education, incidental literacy and language practice through entertainment. The paper investigates how the product addresses engagement and empowerment simultaneously, based on elements such as emotion…

    Alquist 5.0: Dialogue Trees Meet Generative Models. A Novel Approach for Enhancing SocialBot Conversations

    We present our SocialBot, Alquist 5.0, developed for the Alexa Prize SocialBot Grand Challenge 5. Building upon previous versions of our system, we introduce the NRG Barista and outline several innovative approaches for integrating Barista into our SocialBot, improving the overall conversational experience. Additionally, we extend our SocialBot to support multimodal devices. This paper offers insights into the development of Alquist 5.0, which meets evolving user expectations while maintaining empathetic and knowledgeable conversational abilities across diverse topics.
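    The abstract leaves the integration details open, but the pattern the title names, scripted dialogue trees backed by a generative model, can be sketched in a few lines. The node structure, keyword matching and generate_reply stub below are illustrative assumptions, not the Alquist 5.0 design:

```python
# Illustrative sketch only -- NOT the Alquist 5.0 implementation, whose
# internals the abstract does not specify. It shows the generic pattern the
# title names: follow a scripted dialogue tree while it matches, and hand
# off to a generative model when it does not.
from dataclasses import dataclass, field


@dataclass
class DialogueNode:
    """A scripted node: a prompt plus keyword-triggered child branches."""
    prompt: str
    branches: dict[str, "DialogueNode"] = field(default_factory=dict)


def generate_reply(history: list[str]) -> str:
    """Hypothetical stand-in for a generative model (e.g. an LLM call)."""
    return "That's interesting -- tell me more!"


def respond(node: DialogueNode, user_turn: str,
            history: list[str]) -> tuple[str, DialogueNode]:
    """Prefer a scripted branch; otherwise fall back to generation."""
    history.append(user_turn)
    for keyword, child in node.branches.items():
        if keyword in user_turn.lower():
            history.append(child.prompt)
            return child.prompt, child     # scripted path keeps control
    reply = generate_reply(history)        # generative model fills the gap
    history.append(reply)
    return reply, node                     # stay put; the script can resume


# Tiny tree: a movie topic with one scripted branch.
root = DialogueNode("Seen any good movies lately?",
                    branches={"yes": DialogueNode("What did you like about it?")})
history = [root.prompt]
reply, state = respond(root, "Yes, I watched a thriller.", history)
print(reply)  # scripted branch fires: "What did you like about it?"
```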

    Lyrics-to-Audio Alignment and its Application

    Automatic lyrics-to-audio alignment techniques have drawn growing attention in recent years, and a variety of studies have been conducted in this field. Lyrics-to-audio alignment aims to estimate the temporal relationship between lyrics and musical audio signals, and it enables applications such as karaoke-style lyrics display. In this contribution, we provide an overview of recent developments in this research topic, with a particular focus on the categorization of the various methods and on applications.
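    As a hedged illustration of the task's core mechanic (not a method from the survey), the sketch below aligns lyric units to audio frames with dynamic time warping over a synthetic cost matrix; real systems would derive the costs from acoustic models, and the 0.5 s frame hop is assumed:

```python
# Toy sketch of the core mechanic -- NOT a method from the survey. Real
# systems derive the cost matrix from acoustic models (e.g. phoneme
# posteriors); here it is synthetic, and the frame hop is assumed.
import numpy as np


def dtw_path(cost: np.ndarray) -> list[tuple[int, int]]:
    """Minimum-cost monotonic path through a (frames x lyric-units) matrix."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j],      # frame advances, lyric unit held
                acc[i - 1, j - 1],  # frame and lyric unit advance together
            )
    path, i, j = [], n, m           # backtrack to recover the alignment
    while i > 0:
        path.append((i - 1, j - 1))
        if j > 1 and acc[i - 1, j - 1] <= acc[i - 1, j]:
            j -= 1
        i -= 1
    return path[::-1]               # (frame, unit) pairs in time order


# Demo: 3 lyric lines vs. 12 audio frames, with a cheap "true" alignment.
rng = np.random.default_rng(0)
cost = rng.random((12, 3))
cost[0:4, 0] = cost[4:8, 1] = cost[8:12, 2] = 0.0
hop_s = 0.5                         # assumed frame hop in seconds
starts = {}
for frame, unit in dtw_path(cost):
    starts.setdefault(unit, frame * hop_s)
print(starts)  # {0: 0.0, 1: 2.0, 2: 4.0} -> display time for each line
```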

    Self-Supervised Representation Learning for Vocal Music Context

    In music and speech, meaning is derived at multiple levels of context. Affect, for example, can be inferred both from a short sound token and from sonic patterns over a longer temporal window, such as an entire recording. In this paper we focus on inferring meaning from this dichotomy of contexts. We show how contextual representations of short sung vocal lines can be implicitly learned from fundamental frequency (F0) and thus be used as a meaningful feature space for downstream Music Information Retrieval (MIR) tasks. We propose three self-supervised deep learning paradigms which leverage pseudo-task learning of these two levels of context to produce latent representation spaces. We evaluate the usefulness of these representations by embedding unseen vocal contours into each space and conducting downstream classification tasks. Our results show that contextual representations can enhance downstream classification by as much as 15% compared to traditional statistical contour features.
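    As a hedged illustration of this family of approaches (the abstract does not detail the three paradigms), the sketch below learns contour embeddings with a contrastive pseudo-task over F0 crops; every architecture choice and hyperparameter in it is assumed:

```python
# Hedged illustration of the general family -- NOT the paper's three
# paradigms, which the abstract does not detail. A contrastive pseudo-task:
# two crops of the same F0 contour are positives, other contours negatives.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ContourEncoder(nn.Module):
    """Map a 1-D F0 contour to a unit-norm embedding."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),            # pool over time
        )
        self.proj = nn.Linear(64, dim)

    def forward(self, f0: torch.Tensor) -> torch.Tensor:   # (batch, frames)
        h = self.net(f0.unsqueeze(1)).squeeze(-1)          # (batch, 64)
        return F.normalize(self.proj(h), dim=-1)


def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Matching crops are positives; the rest of the batch are negatives."""
    logits = z1 @ z2.t() / tau
    return F.cross_entropy(logits, torch.arange(z1.size(0)))


# Toy step: random tensors stand in for real F0 tracks of sung vocal lines.
contours = torch.randn(16, 200)                        # 16 contours, 200 frames
crop_a, crop_b = contours[:, :120], contours[:, 80:]   # overlapping crops
enc = ContourEncoder()
loss = info_nce(enc(crop_a), enc(crop_b))
loss.backward()                                        # one self-supervised update
```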