
    Multimodal Content Analysis for Effective Advertisements on YouTube

    The rapid advances in e-commerce and Web 2.0 technologies have greatly increased the impact of commercial advertisements on the general public. As a key enabling technology, a multitude of recommender systems exist which analyze user features and browsing patterns to recommend appealing advertisements to users. In this work, we study the attributes that characterize an effective advertisement and recommend a useful set of features to aid the design and production of commercial advertisements. We analyze the temporal patterns in the multimedia content of advertisement videos, including auditory, visual and textual components, and study their individual roles and synergies in the success of an advertisement. The objective is to measure the effectiveness of an advertisement and to recommend a useful set of features to advertisement designers to make it more successful and approachable to users. Our proposed framework employs the signal processing technique of cross-modality feature learning, where data streams from the different components are used to train separate neural network models and are then fused to learn a shared representation. A neural network trained on this joint feature embedding is subsequently used as a classifier to predict advertisement effectiveness. We validate our approach using subjective ratings from a dedicated user study, the sentiment strength of online viewer comments, and a viewer opinion metric: the ratio of Likes to Views received by each advertisement on an online platform. Comment: 11 pages, 5 figures, ICDM 201
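The fusion step described above can be illustrated with a minimal sketch: per-modality encoders produce embeddings that are concatenated into a shared representation and fed to a classifier. Everything here (dimensions, the fixed random "encoders", the sigmoid classifier) is invented for illustration; the paper's actual models are trained neural networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality feature vectors for one advertisement
audio = rng.normal(size=16)   # auditory features
video = rng.normal(size=32)   # visual features
text = rng.normal(size=8)     # textual features

def project(x, dim, seed):
    """Stand-in for a trained per-modality encoder: a fixed linear map."""
    w = np.random.default_rng(seed).normal(size=(dim, x.size))
    return np.tanh(w @ x)

# Encode each stream separately, then fuse into a shared representation
shared = np.concatenate([project(audio, 8, 1),
                         project(video, 8, 2),
                         project(text, 8, 3)])

# A classifier on the joint embedding scores effectiveness in (0, 1)
w_clf = np.random.default_rng(4).normal(size=shared.size)
score = 1.0 / (1.0 + np.exp(-(w_clf @ shared)))
print(shared.shape, float(score))
```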

    Music recommender systems. Proof of concept

    Data overload is a well-known problem arising from the availability of big online distributed databases. While these provide a wealth of information, the difficulty of finding the sought data and the time spent searching call for technological solutions. Classical search engines alleviate this problem and have transformed the way people access the information they are interested in. The Internet has also changed music consumption habits around the world: it is possible to find almost every recorded song or music piece. Over the last years, music streaming platforms like Spotify, Apple Music or Amazon Music have contributed to a substantial change in users' listening habits and in the way music is commercialized and distributed. On-demand music platforms offer their users a huge catalogue, so they can quickly search and listen to what they want or build up a personal library. In this context, music recommender systems may help users discover music that matches their tastes, making them a powerful tool for getting the most out of an immense catalogue that no human could fully know. This project aims at testing different music recommendation approaches applied to the particular case of user playlists. Several recommender alternatives were designed and evaluated: collaborative filtering systems, content-based systems and hybrid recommender systems that combine both techniques. Two systems are proposed. One is content-based and uses correlation between tracks characterized by high-level descriptors; the other is a hybrid recommender that first applies a collaborative method to filter the database and then computes the final recommendation using Gaussian Mixture Models. Recommendations were evaluated using objective metrics and human evaluations, obtaining positive results.
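The content-based alternative described above (correlation between tracks characterized by high-level descriptors) can be sketched as follows; the track names and descriptor values are invented, and a real system would use many more descriptors:

```python
import numpy as np

# Hypothetical high-level descriptors per track (e.g. danceability,
# energy, valence); all values are invented for illustration.
tracks = {
    "track_a": np.array([0.8, 0.9, 0.7]),
    "track_b": np.array([0.7, 0.8, 0.6]),
    "track_c": np.array([0.1, 0.2, 0.9]),
}

def recommend(playlist, catalogue, k=1):
    """Rank unseen tracks by Pearson correlation with the playlist centroid."""
    centroid = np.mean([catalogue[t] for t in playlist], axis=0)
    scores = {
        name: float(np.corrcoef(centroid, feats)[0, 1])
        for name, feats in catalogue.items() if name not in playlist
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend(["track_a"], tracks))
```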

    Encouraging attention and exploration in a hybrid recommender system for libraries of unfamiliar music

    There are few studies of user interaction with music libraries comprising solely unfamiliar music, despite such music being represented in national music information centre collections. We aim to develop a system that encourages exploration of such a library. This study investigates how 69 users' pre-existing musical genre and feature preferences influenced their continuous real-time psychological affect responses during listening, and how the acoustic features of the music influenced their liking and familiarity ratings, for unfamiliar art music (the collection of the Australian Music Centre) during a sequential hybrid recommender-guided interaction. We successfully mitigated the unfavorable starting conditions (no prior item ratings or participant item choices) by using each participant's pre-listening music preferences, translated into acoustic features and linked to item view counts from the Australian Music Centre database, to choose their seed item. We found that liking/familiarity ratings for the first item were on average higher than for the subsequent 15 items and comparable with the maximal values at the end of listeners' sequential responses, showing acoustic features to be useful predictors of responses. We required users to give a continuous indication of their perception of the affect expressed as they listened to 30-second excerpts, with our system successfully providing either a "similar" or "dissimilar" next item, according to, and confirming, the utility of the items' acoustic features, but chosen from the affective responses to the preceding item. We also developed predictive statistical time series models of liking and familiarity, using music preferences and preceding ratings. Our analyses suggest our users were at the low starting end of the commonly observed inverted-U relationship between exposure and both liking and perceived familiarity, which were closely related.
Overall, our hybrid recommender worked well under extreme conditions, with 53 unique items from 100 chosen as "seed" items, suggesting future enhancements of our approach can productively encourage exploration of libraries of unfamiliar music.

    Enhancing film sound design using audio features, regression models and artificial neural networks

    This is an Accepted Manuscript of an article published by Taylor & Francis in the Journal of New Music Research on 21/09/2021, available online: https://doi.org/10.1080/09298215.2021.1977336
    Making the link between human emotion and music is challenging. Our aim was to produce an efficient system that emotionally rates songs from multiple genres. To achieve this, we employed a series of online self-report studies utilising Russell's circumplex model. The first study (n = 44) identified audio features that map to arousal and valence for 20 songs. From this, we constructed a set of linear regressors. The second study (n = 158) measured the efficacy of our system, utilising 40 new songs to create a ground truth. Results show our approach may be effective at emotionally rating music, particularly in the prediction of valence.
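A minimal sketch of the linear-regressor step: fitting valence from audio features by ordinary least squares. The feature values and ratings below are invented stand-ins, not the study's data:

```python
import numpy as np

# Toy stand-ins for per-song audio features (e.g. tempo, spectral
# centroid, RMS energy) and self-reported valence ratings.
X = np.array([[120, 1500, 0.30],
              [ 80,  900, 0.10],
              [140, 2000, 0.45],
              [ 60,  700, 0.05]], dtype=float)
y = np.array([0.7, -0.2, 0.9, -0.5])      # valence in [-1, 1]

# Fit a linear regressor with intercept by ordinary least squares
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict_valence(features):
    """Predicted valence for a new song's feature vector."""
    return float(np.append(features, 1.0) @ coef)

print(round(predict_valence([100, 1200, 0.2]), 2))
```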

    Mel-Frequency Cepstral Coefficients and Convolutional Neural Network for Genre Classification of Indigenous Nigerian Music

    Music genre classification is a field of study within the broader domain of Music Information Retrieval (MIR) that is still an open problem. This study aims to classify music by Nigerian artists into respective genres using Convolutional Neural Networks (CNNs) and audio features extracted from the songs. To achieve this, a dataset of 524 Nigerian songs was collected from different genres. Each downloaded music file was converted from MP3 to WAV format and then trimmed to 30 seconds. The Librosa library was used for the analysis, visualization and further pre-processing of the music files, including converting the audio signals to Mel-frequency cepstral coefficients (MFCCs). The MFCCs were obtained by performing a Discrete Cosine Transform on the logarithm of the Mel-scale filtered power spectrum of the audio signals. A CNN architecture with multiple convolutional and pooling layers was used to learn the relevant features and classify the genres. Six models were trained using a categorical cross-entropy loss function with different learning rates and optimizers. Performance was evaluated using accuracy, precision, recall, and F1-score. The models returned varying results, but model 3, which was trained with an Adagrad optimizer and a learning rate of 0.01, achieved accuracy of 75.1% and recall of 84%. The results demonstrate the effectiveness of MFCCs and CNNs in music genre classification, particularly for indigenous Nigerian music.
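The MFCC pipeline described above (power spectrum, Mel filterbank, logarithm, DCT) can be sketched for a single frame in plain numpy; in practice Librosa handles framing, windowing and these steps. All parameter values here are illustrative defaults, not those used in the study:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mfcc_frame(frame, sr=22050, n_mels=20, n_mfcc=13):
    """Simplified single-frame MFCC: power spectrum -> Mel filterbank
    -> log -> DCT-II, keeping the first n_mfcc coefficients."""
    spectrum = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    # Triangular filters equally spaced on the Mel scale
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    hz_pts = 700.0 * (10 ** (mel_pts / 2595.0) - 1.0)
    fbank = np.zeros((n_mels, len(freqs)))
    for m in range(1, n_mels + 1):
        lo, c, hi = hz_pts[m - 1], hz_pts[m], hz_pts[m + 1]
        up = (freqs - lo) / (c - lo)
        down = (hi - freqs) / (hi - c)
        fbank[m - 1] = np.clip(np.minimum(up, down), 0, None)
    log_mel = np.log(fbank @ spectrum + 1e-10)
    # DCT-II decorrelates the log-Mel energies
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * n + 1) / (2 * n_mels)))
    return dct @ log_mel

frame = np.sin(2 * np.pi * 440 * np.arange(512) / 22050)  # 440 Hz test tone
print(mfcc_frame(frame).shape)
```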

    Multimodal music information processing and retrieval: survey and future challenges

    To improve performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the applications addressed. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.

    Affective Music Information Retrieval

    Much of the appeal of music lies in its power to convey emotions/moods and to evoke them in listeners. In consequence, the past decade witnessed a growing interest in modeling emotions from musical signals in the music information retrieval (MIR) community. In this article, we present a novel generative approach to music emotion modeling, with a specific focus on the valence-arousal (VA) dimensional model of emotion. The presented generative model, called \emph{acoustic emotion Gaussians} (AEG), better accounts for the subjectivity of emotion perception through the use of probability distributions. Specifically, it learns a Gaussian mixture model in the VA space from the emotion annotations of multiple subjects, with prior constraints on the corresponding acoustic features of the training music pieces. Such a computational framework is technically sound, capable of learning in an online fashion, and thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition and emotion-based music retrieval. We report evaluations of the aforementioned applications of AEG on a large-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted for research on emotion-based MIR. Directions for future work are also discussed. Comment: 40 pages, 18 figures, 5 tables, author version
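To make the mixture idea concrete, here is a toy sketch of computing component responsibilities for one VA annotation under a two-component isotropic Gaussian mixture. The means, weights and variance are invented; AEG's actual mixture is learned from multi-subject annotations with acoustic priors:

```python
import numpy as np

# Hypothetical Gaussian components in the valence-arousal plane
means = np.array([[0.6, 0.5],     # positive valence, high arousal
                  [-0.5, -0.4]])  # negative valence, low arousal
weights = np.array([0.5, 0.5])
var = 0.1                         # shared isotropic variance

def responsibilities(point):
    """Posterior probability of each component given one VA annotation."""
    d2 = np.sum((means - point) ** 2, axis=1)
    dens = weights * np.exp(-d2 / (2 * var)) / (2 * np.pi * var)
    return dens / dens.sum()

r = responsibilities(np.array([0.5, 0.4]))  # a happy-sounding annotation
print(np.round(r, 3))
```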

    Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text

    Collaborative filtering (CF) is the key technique for recommender systems (RSs). CF exploits only user-item behavior interactions (e.g., clicks) and hence suffers from the data sparsity issue. One research thread integrates auxiliary information such as product reviews and news titles, leading to hybrid filtering methods. Another thread transfers knowledge from other source domains, e.g., improving movie recommendation with knowledge from the book domain, leading to transfer learning methods. In real life, no single service can satisfy all of a user's information needs. This motivates us to exploit both auxiliary and source information for RSs in this paper. We propose a novel neural model to smoothly enable Transfer Meeting Hybrid (TMH) methods for cross-domain recommendation with unstructured text in an end-to-end manner. TMH attentively extracts useful content from unstructured text via a memory module and selectively transfers knowledge from a source domain via a transfer network. On two real-world datasets, TMH shows better performance in terms of three ranking metrics compared with various baselines. We conduct thorough analyses to understand how the text content and transferred knowledge help the proposed model. Comment: 11 pages, 7 figures, a full version of the WWW 2019 short paper
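The attentive memory readout can be illustrated with a toy sketch: softmax-weighted averaging of memory slots given a query. The vectors here are invented; TMH's actual memory module operates on learned embeddings of review text:

```python
import numpy as np

def attend(query, memory):
    """Attentive readout: softmax over query-slot dot products,
    then a weighted sum of the slots."""
    scores = memory @ query
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory, weights

# Hypothetical embeddings of three review words (values invented)
memory = np.array([[1.0, 0.0],   # word closely matching the query
                   [0.0, 1.0],   # unrelated word
                   [0.9, 0.1]])  # near-duplicate of the first
query = np.array([1.0, 0.0])     # user-item query vector

readout, weights = attend(query, memory)
print(np.round(weights, 3))
```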