74 research outputs found

Music Information Retrieval in Live Coding: A Theoretical Framework

Author: Freeman Jason
Lerch Alexander
Xambo Anna
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2018

The work presented in this article was partly conducted while the first author was at Georgia Tech from 2015–2017 with the support of the School of Music, the Center for Music Technology and Women in Music Tech at Georgia Tech. Another part of this research was conducted while the first author was at Queen Mary University of London from 2017–2019 with the support of the AudioCommons project, funded by the European Commission through the Horizon 2020 programme, research and innovation grant 688382. The file attached to this record is the author's final peer-reviewed version. The publisher's final version can be found by following the DOI link.

Music information retrieval (MIR) has great potential in musical live coding because it can help the musician–programmer to make musical decisions based on audio content analysis and to explore new sonorities by means of MIR techniques. Real-time MIR techniques can be computationally demanding and have therefore rarely been used in live coding; when they have been used, the focus has been on low-level feature extraction. This article surveys and discusses the potential of MIR applied to live coding at a higher musical level. We propose a conceptual framework of three categories: (1) audio repurposing, (2) audio rewiring, and (3) audio remixing. We explored the three categories in live performance through MIRLC, an application programming interface library written in SuperCollider. We found that it is still a technical challenge to use high-level features in real time, yet using rhythmic and tonal properties (midlevel features) in combination with text-based information (e.g., tags) helps to achieve a closer perceptual level centered on pitch and rhythm when using MIR in live coding. We discuss challenges and future directions of utilizing MIR approaches in the computer music field.

De Montfort University Open Research Archive

NORA - Norwegian Open Research Archives

Real-time audiovisual and interactive applications for desktop and mobile platforms

Author: Ferreira Inês Vale
Publication date: 01/01/2012

Integrated master's thesis. Informatics and Computing Engineering. Universidade do Porto. Faculdade de Engenharia. 201

Repositório Aberto da Universidade do Porto

Roadmap for Music Information ReSearch

Author: Benetos E.
Chudy M.
Dixon S.
Flexer A.
Gomez E.
Gouyon F.
Herrera P.
Jorda S.
Magas M.
Paytuvi O.
Peeters G.
Schlüter J.
Serra X.
Vinet H.
Widmer G.
Publication venue: MIRES Consortium
Publication date: 01/01/2013

City Research Online

UPF Digital Repository

Soundscape Generation Using Web Audio Archives

Author: Paulo Jorge Fernandes Teixeira
Publication date: 17/07/2019

The large and growing archives of audio content on the web have been transforming sound design practice. In this context, sampling, a fundamental sound design tool, has shifted from mechanical recording to the realms of copying and cutting on the computer. Effectively browsing these large archives and retrieving content has become a well-identified problem in Music Information Retrieval, addressed namely through the adoption of audio content-based methodologies. Despite their robustness and effectiveness, current technological solutions rely mostly on (statistical) signal processing methods, whose terminology does attain a level of user-centered explanatory adequacy. This dissertation advances a novel semantically oriented strategy for browsing and retrieving audio content, in particular environmental sounds, from large web audio archives. Ultimately, we aim to streamline the retrieval of user-defined queries to foster a fluid generation of soundscapes. In our work, web audio archives are queried by affective dimensions that relate to emotional states (e.g., low arousal and low valence) and by semantic audio source descriptions (e.g., rain). To this end, we map human annotations of affective dimensions to spectral audio-content descriptions extracted from the signal content. New sounds are then retrieved from web archives by specifying a query that combines a point in a two-dimensional affective plane with semantic tags. A prototype application, MScaper, implements the method in the Ableton Live environment. An evaluation of our research assesses the perceptual soundness of the spectral audio-content descriptors in capturing affective dimensions and the usability of MScaper. The results show that spectral audio features significantly capture affective dimensions and that MScaper has been perceived by expert users as having excellent usability.

Repositório Aberto da Universidade do Porto

Investigating the cognitive foundations of collaborative musical free improvisation: Experimental case studies using a novel application of the subsumption architecture

Author: Linson Adam
Publication date: 04/04/2014

This thesis investigates the cognitive foundations of collaborative musical free improvisation. To explore the cognitive underpinnings of the collaborative process, a series of experimental case studies was undertaken in which expert improvisors performed with an artificial agent. The research connects ecological musicology and subsumption robotics, and builds upon insights from empirical psychology pertaining to the attribution of intentionality. A distinguishing characteristic of free improvisation is that no over-arching framework of formal musical conventions defines it, and it cannot be positively identified by sound alone, which poses difficulties for traditional musicology. Current musicological research has begun to focus on the social dimension of music, including improvisation. Ecological psychology, which focuses on the relation of cognition to agent–environment dynamics using the notion of affordances, has been shown to be a promising approach to understanding musical improvisation. This ecological approach to musicology makes it possible to address the subjective and social aspects of improvised music, as opposed to the common treatment of music as objective and neutral. The subjective dimension of musical listening has been highlighted in music cognition studies of cue abstraction, whereby listeners perceive emergent structures while listening to certain forms of music when no structures are identified in advance. These considerations informed the design of the artificial agent, Odessa, used for this study. In contrast to traditional artificial intelligence (AI), which tends to view the world as objective and neutral, behaviour-based robotics historically developed around ideas similar to those of ecological psychology, focused on agent–environment dynamics and the ability to deal with potentially rapidly changing environments. 
Behaviour-based systems that are designed using the subsumption architecture are robust and flexible in virtue of their modular, decentralised design comprised of simple interactions between simple mechanisms. The competence of such agents is demonstrated on the basis of their interaction with the environment and ability to cope with unknown and dynamic conditions, which suggests the concept of improvisation. This thesis documents a parsimonious subsumption design for an agent that performs musical free improvisation with human co-performers, as well as the experimental studies conducted with this agent. The empirical component examines the human experience of collaborating with the agent and, more generally, the cognitive psychology of collaborative improvisation. The design was ultimately successful, and yielded insights about cognition in collaborative improvisation, in particular, concerning the central relationship between perceived intentionality and affordances. As a novel application of the subsumption architecture, this research contributes to AI/robotics and to research on interactive improvisation systems. It also contributes to music psychology and cognition, as well as improvisation studies, through its empirical grounding of an ecological model of musical interaction.

Open Research Online (The Open University)

Workset Creation for Scholarly Analysis: Prototyping Project

Author: Cole Timothy
Downie J. Stephen
Plale Beth
Publication date: 15/03/2013

Scholars rely on library collections to support their scholarship. Out of these collections, scholars select, organize, and refine the worksets that will answer to their particular research objectives. The requirements for those worksets are becoming increasingly sophisticated and complex, both as humanities scholarship has become more interdisciplinary and as it has become more digital. The HathiTrust is a repository that centrally collects image and text representations of library holdings digitized by the Google Books project and other mass-digitization efforts. The HathiTrust's computational infrastructure is being built to support large-scale manipulation and preservation of these representations, but it organizes them according to catalog records that were created to enable users to find books in a building or to make high-level generalizations about duplicate holdings across libraries, etc. These catalog records were never meant to support the granularity of sorting and selection of works that scholars now expect, much less page-level or chapter-level sorting and selection out of a corpus of billions of pages. The ability to slice through a massive corpus consisting of many different library collections, and out of that to construct the precise workset required for a particular scholarly investigation, is the "game changing" potential of the HathiTrust; understanding how to do that is a research problem, and one that is of keen interest to the HathiTrust Research Center (HTRC), since we believe that scholarship begins with the selection of appropriate resources. Given the unprecedented size and scope of the HathiTrust corpus, in conjunction with the HTRC's unique computational access to copyrighted materials, we are proposing a project that will engage scholars in designing tools for exploration, location, and analytic grouping of materials so they can routinely conduct computational scholarship at scale, based on meaningful worksets. "Workset Creation for Scholarly Analysis: Prototyping Project" (WCSA) seeks to address three sets of tightly intertwined research questions regarding 1) enriching the metadata in the HathiTrust corpus, 2) augmenting string-based metadata with URIs to leverage discovery and sharing through external services, and 3) formalizing the notion of collections and worksets in the context of the HathiTrust Research Center. Building upon the model of the Open Annotation Collaboration, the HTRC proposes to release an open, competitive Request for Proposals with the intent to fund four prototyping projects that will build tools for enriching and augmenting metadata for the HathiTrust corpus. Concurrently, the HTRC will work closely with the Center for Informatics Research in Science and Scholarship (CIRSS) to develop and instantiate a set of formal data models that will be used to capture and integrate the outputs of the funded prototyping projects with the larger HathiTrust corpus.

Andrew W. Mellon Foundation, grant no. 21300666

Illinois Digital Environment for Access to Learning and Scholarship Repository

Music similarity analysis using the big data framework spark

Author: Schoder Johannes
Publication date: 01/01/2019

A parameterizable recommender system based on the Big Data processing framework Spark is introduced, which takes multiple tonal properties of music into account and is capable of recommending music based on a user's personal preferences. The implemented system is fully scalable: more songs can be added to the dataset, the cluster size can be increased, and further kinds of audio features and more state-of-the-art similarity measurements can be added. This thesis also deals with the extraction of the required audio features in parallel on a computer cluster. The extracted features are then processed by the Spark-based recommender system, and song recommendations for a dataset consisting of approximately 114,000 songs are retrieved in less than 12 seconds on a 16-node Spark cluster, combining eight different audio feature types and similarity measurements.

Digitale Bibliothek Thüringen

EASA : Environment Aware Social Agent

Author: Bâgcı Furkan Burak
Cakmak Hüseyin
Cengiz Kübra
Gilmartin Emer
Haddad Kevin El
Haider Fasih
Khaki Hossein
Kılı Vedat Gazi
Leroy Julien
Marighetto Pierre
Marzban Shabbir
Pulisci Roberto
Riche Nicolas
Sezer Hilal
Sulir Martin
Torre Ilaria
Türker Bekir Berker
Yazıcı Ramazan
Yenge Sena Büsra
Publication date: 22/01/2018

Edinburgh Research Explorer

LC: A Mostly-strongly-timed Prototype-based Computer Music Programming Language that Integrates Objects and Manipulations for Microsound Synthesis

Author: HIROKI NISHINO
Publication date: 21/01/2014

Ph.D. (Doctor of Philosophy)

ScholarBank@NUS

Visualizing Music Collections Based on Metadata: Concepts, User Studies and Design Implications

Author: Holm Jukka
Publication venue: Tampere University of Technology
Publication date: 01/01/2012

Modern digital music services and applications enable easy access to vast online and local music collections. To differentiate themselves from their competitors, software developers should aim to design novel, interesting, entertaining, and easy-to-use user interfaces (UIs) and interaction methods for accessing the music collections. One potential approach is to replace or complement the textual lists with static, dynamic, adaptive, and/or interactive visualizations of selected musical attributes. A well-designed visualization has the potential to make interaction with a service or an application an entertaining and intuitive experience, and it can also improve the usability and efficiency of the system. This doctoral thesis belongs to the intersection of the fields of human-computer interaction (HCI), music information retrieval (MIR), and information visualization (Infovis). HCI studies the design, implementation and evaluation of interactive computing systems; MIR focuses on the different strategies for helping users seek music or music-related information; and Infovis studies the use of visual representations of abstract data to amplify cognition. The purpose of the thesis is to explore the feasibility of visualizing music collections based on three types of musical metadata: musical genre, tempo, and the release year of the music. More specifically, the research goal is to study which visual variables and structures are best suited for representing the metadata, and how the visualizations can be used in the design of novel UIs for music player applications, including music recommendation systems. The research takes a user-centered and constructive design-science approach, and covers all the different aspects of interaction design: understanding the users, the prototype design, and the evaluation. The performance of the different visualizations from the user perspective was studied in a series of online surveys with 51-104 (mostly Finnish) participants. In addition to tempo and release year, five different visualization methods (colors, icons, fonts, emoticons and avatars) for representing musical genres were investigated. Based on the results, promising ways to represent tempo include the number of objects, shapes with a varying number of corners, and y-axis location combined with some other visual variable or clear labeling. Promising ways to represent the release year include lightness and the perceived location on the z- or x-axis. In the case of genres, the most successful method was the avatars, which used elements from the other methods and required the most screen real estate. In the second part of the thesis, three interactive prototype applications (avatars, potentiometers and a virtual world) focusing on visualizing musical genres were designed and evaluated with 40-41 Finnish participants. While the concepts had great potential for complementing traditional text-based music applications, they were too simple and restricted to replace them in longer-term use. The lack of textual search functionality in particular was seen as a major shortcoming. Based on the results of the thesis, it is possible to design recognizable, acceptable, entertaining, and easy-to-use (especially genre) visualizations with certain limitations. Important factors include, e.g., the metadata vocabulary used (e.g., set of musical genres) and visual variables/structures; preferred music discovery mode; available screen real estate; and the target culture of the visualizations.

Trepo - Institutional Repository of Tampere University