
    Music 2025 : The Music Data Dilemma: issues facing the music industry in improving data management

    © Crown Copyright 2019. "Music 2025" investigates the infrastructure issues around the management of digital data in an increasingly stream-driven industry. The findings are the culmination of over 50 interviews with high-profile music industry representatives across the sector, and they reflect key issues as well as areas of consensus and contrasting views. The findings reveal that, while there are strong examples of data initiatives across the value chain, there remain opportunities to improve efficiency and interoperability.

    Current Challenges and Visions in Music Recommender Systems Research

    Music recommender systems (MRS) have experienced a boom in recent years, thanks to the emergence and success of online streaming services, which nowadays put almost all of the world's music at the user's fingertips. While today's MRS considerably help users find interesting music in these huge catalogs, MRS research still faces substantial challenges. In particular, when it comes to building, incorporating, and evaluating recommendation strategies that go beyond simple user-item interactions or content-based descriptors and dig into the very essence of listener needs, preferences, and intentions, MRS research becomes a major endeavor and related publications are quite sparse. The purpose of this trends-and-survey article is twofold. First, we identify and shed light on what we believe are the most pressing challenges MRS research is facing, from both academic and industry perspectives; we review the state of the art towards solving these challenges and discuss its limitations. Second, we detail possible future directions and visions we contemplate for the further evolution of the field. The article should therefore serve two purposes: giving the interested reader an overview of current challenges in MRS research, and providing guidance for young researchers by identifying interesting yet under-researched directions in the field.

    A Transformer-Based Recommendation System for Music Streaming Sessions

    Master's thesis (M.S.), Seoul National University, Graduate School of Data Science, Department of Data Science, August 2022. Advisor: 신효필 (Hyopil Shin). Recommendation systems have grown in popularity over the last few years with the rise of big data and the development of computing resources. Compared to the simple rule-based or content-based filtering methods used during the early development of recommendation systems, recent methodologies implement much more complex models. Latent factor models and collaborative filtering methods were developed to find similarities between users and items without actually knowing their characteristics, and gained popularity. Various item domains, mainly movies and retail, have used these recommendation algorithms extensively. With the development of deep learning architectures, various deep learning based recommendation systems have emerged in recent years. While many of them focus on predicting item ratings given large datasets of user ids, item ids, and ratings, there have also been efforts to generate next-item recommendations. Next-item recommendation receives a session or sequence of actions by some user and tries to predict the user's next action. NVIDIA recently used Transformers, a deep learning architecture from the field of Natural Language Processing (NLP), to build a session-based recommendation system called Transformers4Rec. The system showed state-of-the-art performance in the usual movie and retail domains. In the music domain, unfortunately, advanced models for session-based recommendation have been explored only to a small extent. This thesis therefore applies Transformer-based architectures to session-based recommendation for music streaming, using a dataset from Spotify and the framework from NVIDIA. It explores the unique characteristics of music data that motivate this research.
The effectiveness of Transformer architectures on music data is shown through next-item prediction performance on actual user streaming session data, and methods for feature engineering and data preprocessing that ensure the best prediction results are investigated. An empirical analysis comparing various Transformer architectures is also provided, with models further analyzed using additional feature information. Transformer-based recommendation systems have recently shown strong performance in various domains, but they had not previously been applied to music streaming; this thesis explores how a Transformer-based session recommendation system performs in that domain. Through data preprocessing, we tried to retain only those sessions in which users plausibly listened to tracks because they liked them, and refined the data to fit the session-based recommendation setting. Various music-related features were converted into categorical form so they could be used in model training, and training used the incremental training scheme commonly applied to session-based recommenders. The final experiments overcame the noisiness and sparsity of the data and achieved results competitive with similar datasets, demonstrating the potential of Transformer-based models for music streaming session recommendation and providing a starting point for future researchers. Contents: 1 Introduction (1.1 Research Topic; 1.2 Purpose of Research; 1.3 Need for Research: Recent Trends, Dataset Characteristics); 2 Related Works (2.1 Overview of NLP and RecSys; 2.2 Past Works on Incorporating Features); 3 Methodology (3.1 Music Streaming Sessions Dataset; 3.2 Music Recommendation Model: NVTabular, Transformers4Rec; 3.3 Feature Embeddings; 3.4 Session Information; 3.5 Transformer Architectures; 3.6 Metrics); 4 Experiments (4.1 Data Preprocessing; 4.2 Embedding: No features, Session features, Song features; 4.3 Hyperparameters; 4.4 Training: Problem Statement, Pipeline, Incremental Training and Evaluation; 4.5 Results: Simple item IDs, Item IDs + Session Information, Item IDs + Session Information + Track Metadata); 5 Conclusion and Future Works; Bibliography; Abstract (Korean).
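The session-based next-item setup described above can be sketched in a few lines. This is a minimal, illustrative example of turning a streaming event log into training pairs for next-item prediction; all names (`build_sessions`, `next_item_pairs`, the toy events) are hypothetical, and in practice Transformers4Rec and NVTabular would handle this preprocessing at scale on real Spotify session data.

```python
# Minimal sketch: group a (session_id, track_id) event log into sessions,
# then derive next-item training pairs, as in session-based recommendation.
from collections import defaultdict

def build_sessions(events, min_len=3, max_len=20):
    """Group (session_id, track_id) events into per-session track sequences,
    keeping only sessions long enough to learn from."""
    sessions = defaultdict(list)
    for session_id, track_id in events:
        sessions[session_id].append(track_id)
    # Filter short sessions; truncate very long ones to a fixed window.
    return [seq[:max_len] for seq in sessions.values() if len(seq) >= min_len]

def next_item_pairs(sessions):
    """Each prefix of a session predicts its next track (next-item objective)."""
    pairs = []
    for seq in sessions:
        for i in range(1, len(seq)):
            pairs.append((seq[:i], seq[i]))
    return pairs

events = [("s1", "a"), ("s1", "b"), ("s1", "c"), ("s2", "x"), ("s2", "y")]
sessions = build_sessions(events)   # s2 has only 2 tracks and is filtered out
pairs = next_item_pairs(sessions)   # (["a"],"b") and (["a","b"],"c")
```

A Transformer is then trained on such prefix-to-next-item pairs, optionally with the categorical session and track features the thesis describes.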

    Analysis of the music industry today

    Final Degree Project in Business Administration. Code: AE1049. Academic year 2020/2021. The music industry has undergone many changes in just 20 years, going from physical formats such as vinyl to streaming platforms such as Spotify. The industry is highly dynamic and in constant movement, and it has gone through several key revolutions: the arrival of the Internet, which led to the consumption of pirated music through platforms like Napster and other peer-to-peer networks; the birth of iTunes, the first to make selling music in digital format an easy and simple process; and finally the consumption of music through streaming platforms. Along the way there have been changes in business models, changes in record labels, new roles and agents in the value chain, new ways of monetizing music, and new habits of consuming music. All these challenges have forced the music industry to adapt and innovate into the industry we know today. In this project we describe the music industry from live music to recorded music, explaining the changes the industry has had to face and its main characteristics. We also examine recorded music and its evolution in Spain in more detail, and we set out both the processes that form the value chain and the agents involved.

    Explainability in Music Recommender Systems

    The most common way to listen to recorded music nowadays is via streaming platforms, which provide access to tens of millions of tracks. To assist users in effectively browsing these large catalogs, the integration of Music Recommender Systems (MRSs) has become essential. Current real-world MRSs are often quite complex and optimized for recommendation accuracy. They combine several building blocks based on collaborative filtering and content-based recommendation. This complexity can hinder the ability to explain recommendations to end users, which is particularly important for recommendations perceived as unexpected or inappropriate. While pure recommendation performance often correlates with user satisfaction, explainability has a positive impact on other factors such as trust and forgiveness, which are ultimately essential to maintaining user loyalty. In this article, we discuss how explainability can be addressed in the context of MRSs. We provide perspectives on how explainability could improve music recommendation algorithms and enhance user experience. First, we review common dimensions and goals of recommender explainability, and of eXplainable Artificial Intelligence (XAI) in general, and elaborate on the extent to which these apply, or need to be adapted, to the specific characteristics of music consumption and recommendation. Then, we show how explainability components can be integrated within an MRS and in what form explanations can be provided. Since the evaluation of explanation quality is decoupled from pure accuracy-based evaluation criteria, we also discuss requirements and strategies for evaluating explanations of music recommendations. Finally, we describe the current challenges for introducing explainability within a large-scale industrial music recommender system and provide research perspectives. Comment: To appear in AI Magazine, Special Topic on Recommender Systems 202

    Deep Learning based Recommender System: A Survey and New Perspectives

    With the ever-growing volume of online information, recommender systems have been an effective strategy to overcome information overload. The utility of recommender systems cannot be overstated, given their widespread adoption in many web applications and their potential to ameliorate many problems related to over-choice. In recent years, deep learning has garnered considerable interest in many research fields such as computer vision and natural language processing, owing not only to stellar performance but also to the attractive property of learning feature representations from scratch. The influence of deep learning is also pervasive, recently demonstrating its effectiveness when applied to information retrieval and recommender systems research. Evidently, the field of deep learning in recommender systems is flourishing. This article aims to provide a comprehensive review of recent research efforts on deep learning based recommender systems. More concretely, we provide and devise a taxonomy of deep learning based recommendation models, along with a comprehensive summary of the state of the art. Finally, we expand on current trends and provide new perspectives pertaining to this exciting development of the field. Comment: The paper has been accepted by ACM Computing Surveys. https://doi.acm.org/10.1145/328502

    Applied speech emotion recognition on a serverless Cloud architecture

    Final Degree Project in Computer Engineering (Trabajo de Fin de Grado en Ingeniería Informática), Facultad de Informática UCM, Departamento de Arquitectura de Computadores y Automática, academic year 2021-2022. The source code of this project can be found both in GitHub and Google Drive: https://github.com/RobertFarzan/Speech-Emotion-Recognition-system https://drive.google.com/file/d/1XobYLxcARE73EFwZ3VUr6Po7vum42ajh/view?usp=sharing. The purpose of this final degree thesis, "Applied speech emotion recognition on a serverless Cloud architecture", is to research emotion recognition in the human voice through several techniques, including audio signal processing and deep learning, to classify the emotion detected in a piece of audio, as well as to find ways to deploy this functionality in the Cloud (serverless). From there we derive a brief implementation of a streaming, near-real-time system in which an end user can record audio and continuously retrieve responses about the emotions detected. The idea is an "emotion tracking system" that couples the technologies mentioned above with a simple end-user GUI app that anyone could use to deliberately track their own voice in different situations (during a call, a meeting, etc.) and get a summary visualization of their emotions over time at a glance. This prototype appears to be one of the first software products of its kind: there is a lot of literature on the Internet about Speech Emotion Recognition, and tools exist to help software engineers with this task, but an easy end-user product or solution for real-time SER appears to be non-existent.
As a short summary of the project road map and the technologies involved, the process is as follows: development of a CNN model in TensorFlow 2.0 (with Python) that outputs emotion labels from a short chunk of audio; deployment of a Python script that uses this CNN model to return emotion predictions in AWS Lambda (the Amazon service for serverless Cloud); and finally the design of a Python app with an integrated GUI that sends requests to the Lambda service and retrieves the responses with emotion predictions, presenting them with clear visualizations.
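The near-real-time loop in that road map can be sketched as follows: slice the audio stream into short fixed-size chunks, classify each chunk, and accumulate a timeline of emotion labels for the GUI to visualize. This is a minimal illustrative sketch, not the project's actual code; `classify_chunk` is a hypothetical stand-in for the CNN served behind AWS Lambda, and the sign-of-sum rule it uses is a placeholder for a real model's inference.

```python
# Sketch of the streaming pipeline: chunk audio, classify each chunk,
# and build the emotion timeline the end-user app would plot.
def classify_chunk(chunk):
    """Hypothetical stand-in for the deployed CNN: maps a chunk of samples
    to an emotion label (a real model would run inference here)."""
    return "happy" if sum(chunk) >= 0 else "sad"

def chunk_stream(samples, chunk_size):
    """Split raw samples into fixed-size chunks, dropping any trailing
    remainder shorter than one chunk."""
    return [samples[i:i + chunk_size]
            for i in range(0, len(samples) - chunk_size + 1, chunk_size)]

def emotion_timeline(samples, chunk_size=4):
    """Label each chunk in order, producing one emotion per time window."""
    return [classify_chunk(c) for c in chunk_stream(samples, chunk_size)]

timeline = emotion_timeline([1, 2, 3, 4, -5, -6, -7, -8])
# Two chunks of four samples each yield two emotion labels over time.
```

In the deployed system, `classify_chunk` would instead POST the chunk to the Lambda endpoint and parse the prediction from the response.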

    Using HeidiSongs Music as an Instructional Tool in the Elementary School Classroom: A Case Study

    The purpose of this qualitative multiple case study is to understand how teachers use HeidiSongs music as an instructional tool in the elementary school classroom. HeidiSongs uses multisensory structured language education, engaging multiple senses simultaneously to increase retention. The theories guiding this study are Gardner's theory of multiple intelligences, which includes kinesthetic intelligence among other types, and Krashen's theory of second language acquisition. HeidiSongs uses both musical and kinesthetic activities to enhance literacy. The central research question focused on how teachers use HeidiSongs music as an instructional tool in the elementary school classroom. The sub-questions explored the different instructional settings where this literacy instruction could take place: whole group, small group, and individual instruction. Eleven participants were current or former users of HeidiSongs music, and data were collected virtually through documentation, individual interviews, and a single focus group interview. Data were analyzed through cross-case synthesis, searching for patterns, forming naturalistic generalizations, and explanation building. Findings indicated HeidiSongs is most applicable in the whole-group setting in the elementary school classroom, with teachers and students using recall of the songs during small-group and individual work time to enhance memory. Teachers enjoyed the combination of multisensory music and movement in HeidiSongs and reported an overall positive effect on student engagement, even in diverse populations. Further research distinguishing between the audio, visual, and animated versions of the songs could help teachers determine which version is best suited for each classroom.