11 research outputs found

    Music genre visualization and classification exploiting a small set of high-level semantic features

    In this paper a system for continuous analysis, visualization and classification of musical streams is proposed. The system performs visualization and classification tasks by means of three high-level, semantic features extracted by computing a reduction on a multidimensional low-level feature vector through the usage of Gaussian Mixture Models. The visualization of the semantic characteristics of the audio stream has been implemented by mapping the value of the high-level features on a triangular plot and by assigning to each feature a primary color. In this manner, besides having the representation of musical evolution of the signal, we have also obtained representative colors for each musical part of the analyzed streams. The classification exploits a set of one-against-one three-dimensional Support Vector Machines trained on some target genres. The obtained results on visualization and classification tasks are very encouraging: our tests on heterogeneous genre streams have shown the validity of the proposed approach.
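The abstract does not say how the three feature values are placed on the triangular plot or blended into a color. A minimal sketch of one plausible reading follows, assuming the features are non-negative, that each one pulls the point toward its own triangle corner (barycentric coordinates), and that each one drives a single RGB channel. The vertex layout and the function names `to_triangle_point` and `to_rgb` are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical layout: one corner of an equilateral triangle per feature.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def normalize(features):
    """Scale three non-negative feature values so that they sum to 1."""
    if len(features) != 3 or any(f < 0 for f in features):
        raise ValueError("expected three non-negative feature values")
    total = sum(features)
    if total == 0:
        raise ValueError("at least one feature value must be positive")
    return [f / total for f in features]


def to_triangle_point(features):
    """Map the features to a 2-D point inside the triangle (barycentric weights)."""
    weights = normalize(features)
    x = sum(w * vx for w, (vx, _) in zip(weights, VERTICES))
    y = sum(w * vy for w, (_, vy) in zip(weights, VERTICES))
    return x, y


def to_rgb(features):
    """Blend one primary color per feature into a single 8-bit RGB triple."""
    return tuple(round(255 * w) for w in normalize(features))
```

Under these assumptions, a stream segment dominated by the first feature sits near the first corner and appears mostly red, while a balanced segment sits at the centroid.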

    VoiceCoach: Interactive evidence-based training for voice modulation skills in public speaking

    The modulation of voice properties, such as pitch, volume, and speed, is crucial for delivering a successful public speech. However, it is challenging to master different voice modulation skills. Though many guidelines are available, they are often not practical enough to be applied in different public speaking situations, especially for novice speakers. We present VoiceCoach, an interactive evidence-based approach to facilitate the effective training of voice modulation skills. Specifically, we have analyzed the voice modulation skills from 2623 high-quality speeches (i.e., TED Talks) and used them as the benchmark dataset. Given a voice input, VoiceCoach automatically recommends good voice modulation examples from the dataset based on the similarity of both sentence structures and voice modulation skills. Immediate and quantitative visual feedback is provided to guide further improvement. The expert interviews and the user study provide support for the effectiveness and usability of VoiceCoach. Comment: Accepted by CHI '2

    A multi-level web-based visualization of sound collections

    Sound collections are becoming more frequently used in our daily lives. A common dilemma is how to present a large sound collection in a concise way while still giving the user access to all relevant data. The main objective of this thesis is the creation of a system for concise visualization of collections of sound data. To achieve this, we have used a two-level methodology for visualization, which means we can move between data clusters on two levels. Sound records are clustered based on the similarity between them. To compute the similarities, we have used attributes retrieved from the Essentia library. The advantage of our methodology is the ability to view the data in a manageable way and to explore the collection in greater detail thanks to the multi-level design.

    Análise de Soluções para Busca por Similaridade (Matching) de Dados Musicais

    TCC (graduação) - Universidade Federal de Santa Catarina. Centro Tecnológico. Sistemas de Informação. Sound is not something we can see with our eyes. So what is sound? Sound is the variation of air pressure, so the way a given sound is produced depends on how the air pressure varies. Representing sound numerically is called digitization: a sound wave (analog audio) is represented in binary code (digital audio). It is a process of transforming and compressing sound to reduce its size, since storing audio data in analog format consumes a lot of space. In parallel, it became possible to encode audio files with more information than just the file name by using metadata. With metadata it is possible to name, describe, catalog, and indicate the copyright of a digital audio file. A digital audio file is thus composed of metadata and digitized sound, which makes it a piece of musical data. Data is the physical representation of an event in time and space that carries no grounding of its own, so it is not possible to understand what it represents or why it exists. However, once a "meaning" is attached to the data and it makes sense to the listener, whether or not its referent is clear, information is generated. Musical information has particular behaviors in its production, objectification, and use, and so music has different meanings for each individual. Music used to be an exclusively in-person means of communication; with the evolution of technological inventions, it has gone beyond the physical limits of the medium and plunged into the digital universe. As a result, the problem of representation and the process of building music processing and retrieval systems are aggravated by the need to develop systems whose internal structures are as compatible as possible with the views or wishes of users.
Therefore, this work contributes directly to the study of music information retrieval, adding knowledge that will aid the future development of solutions for similarity search over musical data. Specifically, this work aims to present and compare solutions for music information retrieval. The intention is to analyze solutions that search for musical data not only through exact matching of the search input parameters, such as song title, keywords, or an audio excerpt of the song, but also through approximate (or similar) matching of these parameters.

    Content-based visualisation to aid common navigation of musical audio

    How Visualization Supports the Daily Work in Traditional Humanities on the Example of Visual Analysis Case Studies

    Attempts to convince humanities scholars of digital approaches are often met with resistance. The so-called Digitization Anxiety is the phenomenon that describes the fear of many traditional scientists of being replaced by digital processes. This not only hinders the progress of the scientific domains themselves – since a lot of digital potential is missing – but also makes the everyday work of researchers unnecessarily difficult. Over the past eight years, we have made various attempts to walk the tightrope between 'How can we help traditional humanities to exploit their digital potential?' and 'How can we make them understand that their expertise is not replaced by digital means, but complemented?' We will present our successful interdisciplinary collaborations: how they came about, how they developed, and the problems we encountered. In the first step, we will look at the theoretical basics, which paint a comprehensive picture of the digital humanities and introduce us to the topic of visualization. The field of visualization has shown a special ability: it manages to walk the tightrope and thus keeps digitization anxiety at bay, while not only making it easier for scholars to access their data, but also enabling entirely new research questions. After an introduction to our interdisciplinary collaborations with the Musical Instrument Museum of Leipzig University, as well as with the Bergen-Belsen Memorial, we will present a series of user scenarios that we have collected in the course of 13 publications. These show our cooperation partners solving different research tasks, which we classify using Brehmer and Munzner’s Task Classification.
In this way, we show that we provide researchers with a wide range of opportunities: they can answer their traditional research questions – and in some cases verify long-standing hypotheses about the data for the first time – but also develop their own interest in previously impossible, new research questions and approaches. Finally, we conclude our insights on individual collaborative ideas with perspectives on our newest projects. These have arisen from the growing interest of collaborators in the methods we deliver. For example, we get insights into the music of real virtuosos of the 20th century. The necessary music storage media can be heard for the first time through digital tools without risking damage to the old material. In addition, we can provide computer-aided analysis capabilities that help musicologists in their work. In the course of the visualization project at the Bergen-Belsen Memorial, we will see that what was once a small diary project has grown into a multimodal and international project with institutions of culture and science from eight countries. It is dedicated not only to the question of preserving cultural objects from Nazi persecution contexts but also to modern ways of disseminating and processing knowledge around this context. Finally, we will compile our experience and accumulated knowledge in the form of problems and challenges at the border between computer science and traditional humanities. These will serve as preparation and assistance for future and current interested parties of such interdisciplinary collaborative projects.

    Proceedings of the 7th Sound and Music Computing Conference

    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010