7 research outputs found

    Tag-based social image search with visual-text joint hypergraph learning

    DOI 10.1145/2072298.2072054. MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops, 1517-152

    Machine Learning Approach for Genre Prediction on Spotify Top Ranking Songs

    This paper analyzed the audio features and genres of top-ranking songs on Spotify from January to August 2017. The dataset consists of daily top-ranking songs, their audio features, and their genres. The data were collected from Kaggle.com, the Spotify Web API, and the Discogs APIs. The analysis comprises summary statistics, principal component analysis, and the implementation and evaluation of a machine learning classifier. The principal component analysis converted nine audio features into three principal components, named sound, words in lyrics, and rhythm according to the descriptions of the audio features they include. The machine learning method takes audio features and genres as input and predicts genres for songs in the test set based on their audio features. The classifier achieved 46.9% accuracy, which is lower than expected. Detailed procedures, results, and analysis are provided. Master of Science in Information Scienc
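The reduction from nine audio features to three principal components described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's code: the data are synthetic random numbers standing in for the real audio features, and the function name `pca_reduce` and the choice to standardize each feature first are assumptions made for the example.

```python
import numpy as np

def pca_reduce(features, n_components=3):
    """Project the rows of `features` onto the top principal components."""
    # Standardize each column so features on different scales contribute equally
    # (an assumption; the abstract does not say how the features were scaled).
    centered = features - features.mean(axis=0)
    scaled = centered / centered.std(axis=0, ddof=1)
    # SVD of the standardized data: the rows of vt are the principal directions.
    _, singular_values, vt = np.linalg.svd(scaled, full_matrices=False)
    components = vt[:n_components]
    # Fraction of the total variance captured by each component.
    explained = singular_values**2 / np.sum(singular_values**2)
    return scaled @ components.T, components, explained[:n_components]

# Synthetic stand-in data: 200 hypothetical songs with nine made-up features.
rng = np.random.default_rng(0)
audio_features = rng.normal(size=(200, 9))
scores, components, explained = pca_reduce(audio_features)
print(scores.shape)  # (200, 3)
```

Each row of `components` holds the loadings of one component on the nine features; inspecting which features load most heavily is how components are typically given names such as those in the abstract.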

    A Comprehensive Review on Audio based Musical Instrument Recognition: Human-Machine Interaction towards Industry 4.0

    Over the last two decades, the application of machine technology has shifted from industrial to residential use. Further, advances in hardware and software have led machine technology to its utmost application: human-machine interaction, a form of multimodal communication. Multimodal communication refers to the integration of various modalities of information such as speech, image, music, gesture, and facial expressions. Music is a non-verbal type of communication that humans often use to express their minds. Thus, Music Information Retrieval (MIR) has become a booming field of research and has gained a lot of interest from the academic community, the music industry, and a vast number of multimedia users. The problem in MIR is accessing and retrieving a specific type of music on demand from extensive music data. The most inherent problem in MIR is music classification. The essential MIR tasks are artist identification, genre classification, mood classification, music annotation, and instrument recognition. Among these, instrument recognition is a vital sub-task in MIR for various reasons, including retrieval of music information, sound source separation, and automatic music transcription. In recent years, many researchers have reported different machine learning techniques for musical instrument recognition, and some of these have proved effective. This article provides a systematic, comprehensive review of the advanced machine learning techniques used for musical instrument recognition. We stress the different audio feature descriptors and the common choices of classifier learning used for musical instrument recognition. This review emphasizes recent developments in music classification techniques and discusses a few associated future research problems.

    Toward efficient indexing structure for scalable content-based music retrieval

    We intend to problematize art and madness. We begin by discussing the experience of the researcher in relation to images of the world, to witnessing and to the image of the insane, and then inevitably to the outside they evoke. Subsequently, we stand before a wall, a limit situation in which madness as catastrophe and art as poetics compose a threshold, an absence which Blanchot transposes to language to bring other possible constellations into view, both as words and as their unnamable others. Finally, with Walter Benjamin, we brush the history of madness against the grain; immersed in the Writing Workshop at the São Pedro Psychiatric Hospital, in Porto Alegre, Brazil, we reveal that, in relation to madness, art can become the essential language of the perilous passage towards experience, transposing the experience of this horrific state to bring another sense to the world, recognizing other modes of existence which may come to be other poetics of life.

    HSI: A Novel Framework for Efficient Automated Singer Identification in Large Music Databases

    DOI 10.1109/ICDE.2006.79. Proceedings - International Conference on Data Engineering, 2006, 169

    Machine learning and audio processing : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Albany, Auckland, New Zealand

    In this thesis, we addressed two important theoretical issues, one in deep neural networks and one in clustering. We also developed a new approach for polyphonic sound event detection, which is one of the most important applications in the audio processing area. The three novel approaches developed are: (i) The Large Margin Recurrent Neural Network (LMRNN), which improves the discriminative ability of the original Recurrent Neural Networks by introducing a large margin term into the widely used cross-entropy loss function. The large margin term uses the large margin discriminative principle as a heuristic to guide convergence during training, and it fully exploits the information in the data labels by considering both the target category and the competing categories. (ii) The Robust Multi-View Continuous Subspace Clustering (RMVCSC) approach, which performs clustering on a common view-invariant subspace learned from all views. The clustering result and the common representation subspace are simultaneously optimised by a single continuous objective function. In the objective function, a robust estimator is used to automatically clip specious inter-cluster connections while maintaining convincing intra-cluster correspondences. Thus, RMVCSC can untangle heavily mixed clusters without pre-setting the number of clusters. (iii) A novel polyphonic sound event detection approach based on the Relational Recurrent Neural Network (RRNN), which uses the relational reasoning ability of RRNNs to untangle overlapping sound events across audio recordings. Unlike previous works, which mixed and packed all historical information into a single common hidden memory vector, this approach allows pieces of historical information to interact with each other across an audio recording, which is effective and efficient in untangling overlapping sound events.
    All three approaches are tested on widely used datasets and compared with recently published works. The experimental results demonstrate the effectiveness and efficiency of the developed approaches.