11 research outputs found
Music genre visualization and classification exploiting a small set of high-level semantic features
In this paper a system for continuous analysis, visualization and classification of musical streams is proposed. The system performs the visualization and classification tasks by means of three high-level semantic features, extracted by reducing a multidimensional low-level feature vector with Gaussian Mixture Models. The semantic characteristics of the audio stream are visualized by mapping the values of the high-level features onto a triangular plot and by assigning a primary color to each feature. In this manner, besides representing the musical evolution of the signal, we also obtain representative colors for each musical part of the analyzed streams. The classification exploits a set of one-against-one three-dimensional Support Vector Machines trained on some target genres.
The results obtained on the visualization and classification tasks are very encouraging: our tests on heterogeneous genre streams have shown the validity of the proposed approach.
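As an illustration only, the mapping described in this abstract (three feature values placed on a triangular plot and blended into a color) could be sketched as follows. The abstract does not give the paper's actual formulas, so this sketch makes its own assumptions: the three features are non-negative, they are normalized to sum to 1, the triangle is equilateral, and the "primary colors" are red, green and blue. The names `triangle_point` and `feature_color` are hypothetical.

```python
import math

# Corners of an equilateral triangle, one per high-level feature (assumed layout).
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def normalize(features):
    """Scale three non-negative feature values so that they sum to 1."""
    total = sum(features)
    if total <= 0:
        raise ValueError("at least one feature value must be positive")
    return [f / total for f in features]


def triangle_point(features):
    """Map three feature values to an (x, y) point inside the triangle.

    The normalized values act as barycentric weights on the three corners.
    """
    weights = normalize(features)
    x = sum(w * cx for w, (cx, _) in zip(weights, CORNERS))
    y = sum(w * cy for w, (_, cy) in zip(weights, CORNERS))
    return x, y


def feature_color(features):
    """Blend red, green and blue in proportion to the three feature values."""
    weights = normalize(features)
    return tuple(round(255 * w) for w in weights)
```

Under these assumptions, a stream segment dominated by the first feature lands near the first corner and is drawn mostly red, while a segment with three equal feature values lands at the centroid of the triangle.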
Making large music collections accessible using enhanced metadata and lightweight visualizations
VoiceCoach: Interactive evidence-based training for voice modulation skills in public speaking
The modulation of voice properties, such as pitch, volume, and speed, is
crucial for delivering a successful public speech. However, it is challenging
to master different voice modulation skills. Though many guidelines are
available, they are often not practical enough to be applied in different
public speaking situations, especially for novice speakers. We present
VoiceCoach, an interactive evidence-based approach to facilitate the effective
training of voice modulation skills. Specifically, we have analyzed the voice
modulation skills from 2623 high-quality speeches (i.e., TED Talks) and use
them as the benchmark dataset. Given a voice input, VoiceCoach automatically
recommends good voice modulation examples from the dataset based on the
similarity of both sentence structures and voice modulation skills. Immediate
and quantitative visual feedback is provided to guide further improvement. The
expert interviews and the user study provide support for the effectiveness and
usability of VoiceCoach. Comment: Accepted by CHI '2
A multi-level web-based visualization of sound collections
Sound collections are becoming more frequently used in our daily lives. A common dilemma is how to present a large sound collection concisely while still giving the user access to all relevant data. The main objective of this thesis is to create a system for concise visualization of collections of sound data. To achieve this, we use a two-level visualization methodology, which lets the user move between data clusters on two levels. Sound records are clustered based on their similarity, which is computed from attributes extracted with the Essentia library. The advantage of our methodology is that the data can be viewed in a manageable way and the collection explored in greater detail thanks to the multi-level design.
Análise de Soluções para Busca por Similaridade (Matching) de Dados Musicais (Analysis of Solutions for Similarity Search (Matching) of Musical Data)
Undergraduate thesis (TCC) - Universidade Federal de Santa Catarina. Centro Tecnológico. Sistemas de Informação.
Sound is not something we can see with our eyes. So what is sound? Sound is the variation of air pressure, so the way a given sound is produced depends on how the air pressure varies. Representing sound numerically is called digitization: a sound wave (analog audio) is represented in binary code (digital audio). It is a process of transforming and compressing sound to reduce its size, since storing audio data in analog format consumes a lot of space. In parallel, it became possible to encode audio files with more information than just the file name by using metadata. Metadata makes it possible to name, describe, catalog and indicate the copyright of a digital audio file. A digital audio file is thus composed of metadata and digitized sound, which together make up a musical datum. Data is the physical representation of an event in time and space that carries no meaning by itself: one cannot tell what it represents or why it exists. Once a "meaning" is attached to the data, so that it makes sense to the listener, whether or not it is clear what it refers to, information is generated. Musical information has specific behaviors in its production, objectification and use, and music therefore means different things to each individual.
Music used to be an exclusively in-person means of communication; with the evolution of technology, it has surpassed the physical limits of the medium and plunged into the digital universe. As a result, the problem of representation, and the process of building music processing and retrieval systems, is aggravated by the need to develop systems whose internal structures are as compatible as possible with the views or wishes of users. This work therefore contributes knowledge through a study of music information retrieval that will support the future development of solutions for similarity search over musical data. Specifically, it aims to present and compare solutions for music information retrieval. The intention is to analyze solutions that search for musical data not only through exact matching of the search's input parameters, such as the song title, keywords or an audio clip containing part of the song, but also through approximate (or similar) matching of these parameters.
How Visualization Supports the Daily Work in Traditional Humanities on the Example of Visual Analysis Case Studies
Attempts to convince humanities scholars of digital approaches are often met with
resistance. The so-called Digitization Anxiety is the phenomenon that
describes the fear of many traditional scientists of being replaced by digital
processes. This hinders not only the progress of the scientific domains themselves
– since a lot of digital potential is missing – but also makes the everyday work
of researchers unnecessarily difficult. Over the past eight years, we have
made various attempts to walk the tightrope between 'How can we help
traditional humanities to exploit their digital potential?' and 'How can we
make them understand that their expertise is not replaced by digital means, but
complemented?' We will present our successful interdisciplinary collaborations:
How they came about, how they developed, and the problems we encountered. In
the first step, we will look at the theoretical basics, which paint a comprehensive
picture of the digital humanities and introduce us to the topic of visualization.
The field of visualization has shown a special ability: It manages to walk the
tightrope and thus keeps digitization anxiety at bay, while not only making it
easier for scholars to access their data, but also enabling entirely new research
questions. After an introduction to our interdisciplinary collaborations with
the Musical Instrument Museum of Leipzig University, as well as with the
Bergen-Belsen Memorial, we will present a series of user scenarios that we
have collected in the course of 13 publications. These show our cooperation
partners solving different research tasks, which we classify using Brehmer and
Munzner’s Task Classification. In this way, we show that we provide researchers
with a wide range of opportunities: They can answer their traditional research
questions – and in some cases verify long-standing hypotheses about the data
for the first time – but also develop their own interest in previously impossible,
new research questions and approaches. Finally, we conclude our insights on
individual collaborative ideas with perspectives on our newest projects. These
have risen from the growing interest of collaborators in the methods we deliver.
For example, we get insights into the music of real virtuosos of the 20th century.
The necessary music storage media can be heard for the first time through
digital tools without risking damage to the old material. In addition, we can
provide computer-aided analysis capabilities that help musicologists in their work.
In the course of the visualization project at the Bergen-Belsen memorial, we
will see that what was once a small diary project has grown into a multimodal
and international project with institutions of culture and science from eight
countries. This is dedicated not only to the question of preserving cultural
objects from Nazi persecution contexts but also to modern ways of disseminating
and processing knowledge around this context. Finally, we will compile our
experience and accumulated knowledge in the form of problems and challenges
at the border between computer science and traditional humanities. These will
serve as preparation and assistance for future and current interested parties of
such interdisciplinary collaborative projects.
Proceedings of the 7th Sound and Music Computing Conference
Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010