
    Music emotion recognition: a multimodal machine learning approach

    Music emotion recognition (MER) is an emerging domain within the Music Information Retrieval (MIR) scientific community, and searching for music by emotion is one of the selection methods most preferred by web users. As the world goes digital, the musical content in online databases such as Last.fm has expanded exponentially, requiring substantial manual effort to manage and keep up to date. The demand for innovative, adaptable search mechanisms that can be personalized to users’ emotional states has therefore gained increasing attention in recent years. This thesis addresses the music emotion recognition problem by presenting several classification models fed by textual features as well as audio attributes extracted from the music. We build both supervised and semi-supervised classification designs across four research experiments that address the emotional role of audio features, such as tempo, acousticness, and energy, as well as the impact of textual features extracted by two different approaches, TF-IDF and Word2Vec. Furthermore, we propose a multi-modal approach using a combined feature set consisting of features from the audio content as well as from context-aware data. For this purpose, we generated a ground-truth dataset containing over 1,500 labeled song lyrics, together with an unlabeled corpus of more than 2.5 million Turkish documents, in order to build an accurate automatic emotion classification system. The analytical models were built by applying several algorithms to cross-validated data using Python. The best accuracy attained using only audio features was 44.2%, whereas textual features performed better, with accuracy scores of 46.3% and 51.3% under the supervised and semi-supervised learning paradigms, respectively. Finally, although we created a comprehensive feature set combining audio and textual features, this combination did not yield any significant improvement in classification performance.
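    As a minimal, hypothetical sketch of the two textual feature approaches the abstract names (TF-IDF and Word2Vec) feeding a cross-validated classifier in Python: the toy lyrics, labels, and logistic-regression model below are illustrative assumptions, not the thesis’ actual data or pipeline.

    import numpy as np
    from gensim.models import Word2Vec
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Placeholder corpus and emotion labels; a real study would use labeled song lyrics.
    lyrics = ["happy bright sun dance", "joy smile light dance",
              "tears rain cold alone", "sad dark night alone"]
    labels = ["happy", "happy", "sad", "sad"]

    # TF-IDF: one sparse, weighted bag-of-words vector per document.
    X_tfidf = TfidfVectorizer().fit_transform(lyrics)

    # Word2Vec: train word embeddings, then average the word vectors in each document.
    tokens = [doc.split() for doc in lyrics]
    w2v = Word2Vec(sentences=tokens, vector_size=50, min_count=1, seed=1)
    X_w2v = np.array([np.mean([w2v.wv[w] for w in doc], axis=0) for doc in tokens])

    # Cross-validated accuracy for each feature set (cv=2 only because the toy corpus is tiny).
    clf = LogisticRegression(max_iter=1000)
    for name, X in [("tf-idf", X_tfidf), ("word2vec", X_w2v)]:
        print(name, cross_val_score(clf, X, labels, cv=2).mean())

    In the thesis itself, pipelines of this shape would presumably run over the 1,500-lyric ground-truth set, with the 2.5-million-document Turkish corpus supporting the semi-supervised variants.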

    Dance like nobody's paying: Spotify and Surveillance as the Soundtrack of Our Lives

    This thesis examines Spotify, the world’s most popular music streaming service, and its use of music as a data extraction tool. I position Spotify as a surveillance capitalist firm that puts music at the centre of an enclosed environment designed to condition users’ affective responses and behaviours and to reorient the production of music. I analyze three features of the platform: a campaign in which Spotify invites users and producers to share the data it collects about them, the arrangement of the platform’s architecture into mood-based playlists, and its penchant for music that is “Chill.” I show how each serves the surveillance machine’s goals of collecting and contextualizing data from music and music consumption, data it claims can quantify, predict, and condition behaviour. Using a framework of social and economic theory alongside data and musical analysis, I position Spotify and its exploitation of music within the broader implications of life under surveillance capitalism.

    Expressive language development in minimally verbal autistic children: exploring the role of speech production

    Trajectories of expressive language development are highly heterogeneous in autism. I examine the hypothesis that co-morbid speech production difficulties may be a contributing factor for some minimally verbal autistic individuals. Chapters 1 and 2 provide an overview of language variation within autism and of existing intervention approaches for minimally verbal autistic children; these chapters situate the thesis within the existing literature. Chapter 3 describes a longitudinal study of expressive language in minimally verbal 3- to 5-year-olds (n=27), with four assessment points over 12 months. Contrary to expectations, initial communicative intent, parent responsiveness, and response to joint attention did not predict expressive language growth or outcome; speech skills were significant predictors. Chapter 4 describes the design, development, and feasibility testing of the BabbleBooster app, a novel, parent-mediated speech skills intervention, in which 19 families participated for 16 weeks. Acceptability feedback was positive, but adherence was variable. I discuss how this could be improved in future iterations of the app and intervention protocol. Chapter 5 details how BabbleBooster’s efficacy was evaluated. For interventions with complex or rare populations, a randomized case series design is a useful alternative to an under-powered group trial. There was no evidence that BabbleBooster improved speech production scores, likely due to limited dosage. Future research using this study design could determine optimal treatment intensity and duration with an improved version of the app. Taken together, these studies underscore the contribution of speech production abilities to expressive language development in minimally verbal autistic individuals. I argue that this reflects an additional co-occurring condition rather than a consequence of core autism features. The intervention piloted here represents a first step towards developing a scalable tool for parents to support speech development in minimally verbal children, and it illustrates the utility of randomized single case series for testing treatment effects in small, heterogeneous cohorts.

    Proceedings of the 19th Sound and Music Computing Conference

    Proceedings of the 19th Sound and Music Computing Conference - June 5-12, 2022 - Saint-Étienne (France). https://smc22.grame.fr

    Folk Theories, Recommender Systems, and Human-Centered Explainable Artificial Intelligence (HCXAI)

    This study uses folk theories to enhance human-centered “explainable AI” (HCXAI). The complexity and opacity of machine learning have made explainability a pressing need, and consumer services like Amazon, Facebook, TikTok, and Spotify have made machine learning ubiquitous in the everyday lives of the non-expert, lay public. Two research questions inform this study: What folk theories do users hold to explain how a recommender system works? And is there a relationship between users’ folk theories and the principles of HCXAI that could facilitate the development of more transparent and explainable recommender systems? Using the Spotify music recommendation system as an example, 19 Spotify users were surveyed and interviewed to elicit their folk theories of how personalized recommendations work in a machine learning system. Seven folk theories emerged: complies, dialogues, decides, surveils, withholds and conceals, empathizes, and exploits. These folk theories support, challenge, and augment the principles of HCXAI. Taken collectively, they encourage HCXAI to take a broader view of XAI. The objective of HCXAI is to move towards a more user-centered, less technically focused XAI. The elicited folk theories indicate that this will require adopting principles that encompass policy implications, consumer protection issues, and concerns about intention and the possibility of manipulation. As a window into the complex user beliefs that inform interactions with Spotify, the folk theories offer insights into how HCXAI systems can more effectively provide machine learning explainability to the non-expert, lay public.