Final Research Report for Sound Design and Audio Player
This deliverable describes the work on Task 4.3, Algorithms for sound design and feature developments for audio player. The audio player runs on the in-store player (ISP) and renders the music playlists via beat-synchronous automatic DJ mixing, taking advantage of the rich musical content description extracted in T4.2 (beat markers, structural segmentation into intro and outro, musical and sound content classification).
The deliverable covers prototypes and final results on four topics. (1) Automatic beat-synchronous mixing by beat alignment and time stretching: we developed an algorithm for beat alignment and scheduling of time-stretched tracks. (2) Compensation of play duration changes introduced by time stretching: in order to make the playlist generator independent of beat mixing, we chose to readjust the tempo of played tracks such that their stretched duration is the same as their original duration. (3) Prospective research on the extraction of data from DJ mixes: to alleviate the lack of extensive ground-truth databases of DJ mixing practices, we propose steps towards extracting this data from existing mixes by alignment and unmixing of the tracks in a mix; we also show how these methods can be evaluated even without labelled test data, and propose an open dataset for further research. (4) A description of the software player module, a GUI-less application that runs on the ISP and performs streaming of tracks from disk and beat-synchronous mixing.
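A minimal sketch of the tempo arithmetic behind items (1) and (2) above (the function names and the stretch-ratio convention are assumptions for illustration, not the project's actual player code): the ratio that aligns a track to the mix tempo, and the compensating body ratio that keeps the stretched play duration equal to the original.

```python
def stretch_ratio(track_bpm, mix_bpm):
    # Duration scale factor when the track is time-stretched from its own
    # tempo to the mix tempo (ratio of stretched to original duration):
    # speeding a 120 BPM track up to 126 BPM shortens it by 120/126.
    return track_bpm / mix_bpm

def compensating_ratio(total_beats, in_beats, out_beats, r_in, r_out):
    # Stretch ratio for the track body so that the total stretched duration
    # equals the original duration (assuming a constant original tempo, the
    # duration of b beats stretched by ratio r is proportional to b * r).
    # Solves: body * r_body + in_beats * r_in + out_beats * r_out = total_beats
    body = total_beats - in_beats - out_beats
    return (total_beats - in_beats * r_in - out_beats * r_out) / body
```

With identity stretch ratios at both crossfade regions the body ratio is 1, and any speed-up during the mix-in is paid back by a proportional slow-down of the body (or vice versa), so the playlist generator can ignore beat mixing entirely.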
The estimation of cue points where tracks should cross-fade is now described in D4.7, Final Research Report on Auto-Tagging of Music.
EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC D
Final Research Report on Auto-Tagging of Music
The deliverable D4.7 concerns the work achieved by IRCAM until M36 for the "auto-tagging of music". The deliverable is a research report. The software libraries resulting from the research have been integrated into the Fincons/HearDis! Music Library Manager or are used by TU Berlin. The final software libraries are described in D4.5.
The research work on auto-tagging has concentrated on four aspects:
1) Further improving IRCAM's machine-learning system ircamclass. This has been done by developing the new MASSS audio features and by integrating audio augmentation and audio segmentation into ircamclass. The system has then been applied to train the HearDis! "soft" features (Vocals-1, Vocals-2, Pop-Appeal, Intensity, Instrumentation, Timbre, Genre, Style). This is described in Part 3.
2) Developing two sets of "hard" features (i.e. features related to musical or musicological concepts) as specified by HearDis! (for integration into the Fincons/HearDis! Music Library Manager) and TU Berlin (as input for the prediction model of the GMBI attributes). Such features are either derived from previously estimated higher-level concepts (such as structure, key or succession of chords) or obtained by developing new signal processing algorithms (such as HPSS or main melody estimation). This is described in Part 4.
3) Developing audio features to characterize the audio quality of a music track. The goal is to describe the quality of the audio independently of its apparent encoding; this is then used to estimate audio degradation or the music decade, and to ensure that playlists contain tracks of similar audio quality. This is described in Part 5.
4) Developing innovative algorithms to extract specific audio features to improve music mixes. So far, innovative techniques (based on various Blind Audio Source Separation algorithms and Convolutional Neural Networks) have been developed for singing voice separation, singing voice segmentation, music structure boundary estimation, and DJ cue-region estimation. This is described in Part 6.
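Among the separation techniques alluded to above, harmonic/percussive source separation (HPSS) has a particularly compact classic formulation: median filtering of the magnitude spectrogram along time (harmonic content is smooth over time) and along frequency (percussive content is smooth over frequency). The sketch below is an illustrative toy reimplementation of that general idea, not the deliverable's actual code; `hpss_masks` and its parameters are hypothetical names.

```python
from statistics import median

def hpss_masks(S, kernel=5):
    # Harmonic/percussive separation of a magnitude spectrogram S, given as
    # a list of rows (rows = frequency bins, columns = time frames).
    # Median filtering along time keeps sustained (harmonic) energy;
    # median filtering along frequency keeps broadband (percussive) energy.
    n_f, n_t = len(S), len(S[0])
    pad = kernel // 2
    harm = [[median(S[f][max(0, t - pad):t + pad + 1]) for t in range(n_t)]
            for f in range(n_f)]
    perc = [[median([S[g][t] for g in range(max(0, f - pad),
                                            min(n_f, f + pad + 1))])
             for t in range(n_t)]
            for f in range(n_f)]
    # Hard masks: assign each bin to whichever filtered estimate dominates.
    mask_h = [[harm[f][t] >= perc[f][t] for t in range(n_t)] for f in range(n_f)]
    mask_p = [[not m for m in row] for row in mask_h]
    return mask_h, mask_p
```

Applied to a toy spectrogram, a sustained tone (a horizontal line of energy) ends up in the harmonic mask and a click (a vertical line) in the percussive mask; real systems refine this with soft masks and inverse STFT resynthesis.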
Corpus-Based Transcription as an Approach to the Compositional Control of Timbre
Timbre space is a cognitive model useful to address the problem of structuring timbre in electronic music. The recent concept of corpus-based concatenative sound synthesis is proposed as an approach to timbral control in both real- and deferred-time applications. Using CataRT and related tools in the FTM and Gabor libraries for Max/MSP, we describe a technique for real-time analysis of a live signal to pilot corpus-based synthesis, along with examples of compositional realizations in works for instruments, electronics, and sound installation. To extend this technique to computer-assisted composition for acoustic instruments, we develop tools using the Sound Description Interchange Format (SDIF) to export sonic descriptors to OpenMusic, where they may be further manipulated and transcribed into an instrumental score. This presents a flexible technique for the compositional organization of noise-based instrumental sounds.
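The core selection step of corpus-based concatenative synthesis, choosing for each target frame the corpus unit nearest in descriptor space, can be sketched in a few lines. This is a toy illustration with hypothetical names, not CataRT itself:

```python
import math

def select_unit(target, corpus):
    # corpus: list of (unit_id, descriptor_vector) pairs, where descriptors
    # might be per-segment features such as pitch, loudness and spectral
    # centroid. Returns the unit whose descriptor vector is closest to the
    # target frame in Euclidean distance.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(corpus, key=lambda unit: dist(unit[1], target))
```

A real system adds descriptor weighting and concatenation cost between successive units; the nearest-neighbour lookup above is only the selection kernel.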
Usability of Musical Digital Libraries: a Multimodal Analysis.
There has been substantial research on technical aspects of musical digital libraries, but comparatively little on usability aspects. We have evaluated four web-accessible music libraries, focusing on features specific to music libraries, such as music retrieval mechanisms. Although the original focus of the work was on how modalities are combined within the interactions with such libraries, that was not where the main difficulties were found. The libraries were generally well designed for the use of different modalities. The main challenges identified relate to the details of melody matching and to simplifying the choice of file format. These issues are discussed in detail.
Collaborative computer music composition and the emergence of the computer music designer
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

This submission explores the development of collaborative computer music creation and the role of the Musical Assistant, or Computer Music Designer, or Live Electronics Designer, or RIM (Réalisateur en informatique musicale), and does so primarily through the consideration of a series of collaborations with composers over the last 18 years. The submission documents and evaluates a number of projects which exemplify my practice within collaborative computer music creation, whether in the form of live electronics, tape-based or fixed media work, as a live electronics performer, or working with composers and others to create original tools and music for artistic creations. A selection of works is presented to exemplify archetypes found within the relational structures of collaborative work.

The relatively recent development of this activity as an independent métier is located within its historical context, a context in which my work has played a significant role. The submission evidences the innovative aspects of that work and, more generally, of the role of the Computer Music Designer through consideration of a number of Max patches and program examples especially created for the works under discussion. Finally, the validation of the role of the Computer Music Designer as a new entity within the world of music creation is explored in a range of contexts, demonstrating the ways in which Computer Music Designers not only collaborate in the creation of new work but also generate new resources for computer-based music and new creative paradigms.
Gesture-Controlled Interaction with Aesthetic Information Sonification
Information representation in augmented and virtual reality systems, and in social physical (building) spaces, can enhance the efficacy of interacting with and assimilating abstract, non-visual data. Sonification is the process of automatically generating real-time information representation. There is a gap in our implementation and knowledge of auditory display systems used to enhance interaction in virtual and augmented reality. This paper addresses that gap by examining methodologies for mapping socio-spatial data to spatialised sonification manipulated with gestural controllers. This is a system of interactive knowledge representation that completes the human integration loop, enabling the user to interact with and manipulate data using 3D spatial gesture and 3D auditory display. Benefits include: 1) added immersion in an augmented or virtual reality interface; 2) auditory display avoids visual overload in visually saturated processes such as designing, emergency evacuation, flying aircraft or computer gaming; and 3) bi-modal or auditory representation, due to its time-based character, facilitates cognition of complex information.
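To make the data-to-sound mapping idea concrete, here is a minimal parameter-mapping sketch (hypothetical names and ranges, not the paper's actual system) that turns a scalar datum into a pitch and an azimuth for a spatialised auditory display; pitch is mapped exponentially, since pitch perception is roughly logarithmic in frequency.

```python
def sonify(value, vmin, vmax, fmin=220.0, fmax=880.0):
    # Map a scalar datum in [vmin, vmax] to a frequency in Hz and an
    # azimuth in degrees (-90 = hard left, +90 = hard right).
    t = (value - vmin) / (vmax - vmin)     # normalise to [0, 1]
    freq = fmin * (fmax / fmin) ** t       # exponential (per-octave) pitch map
    azimuth = -90.0 + t * 180.0            # linear pan across the frontal arc
    return freq, azimuth
```

A gestural controller would then drive `value` (or the mapping ranges themselves) in real time, closing the interaction loop the abstract describes.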
Digital waveguide modeling for wind instruments: building a state-space representation based on the Webster-Lokshin model
This paper deals with digital waveguide modeling of wind instruments. It presents the application of state-space representations for the refined acoustic model of Webster-Lokshin. This acoustic model describes the propagation of longitudinal waves in axisymmetric acoustic pipes with a varying cross-section, visco-thermal losses at the walls, and without assuming planar or spherical waves. Moreover, three types of discontinuities of the shape can be taken into account (radius, slope and curvature).
The purpose of this work is to build low-cost digital simulations in the time domain based on the Webster-Lokshin model. First, decomposing a resonator into independent elementary parts and isolating delay operators leads to a Kelly-Lochbaum network of input/output systems and delays. Second, for a systematic assembling of elements, their state-space representations are derived in discrete time. Then, standard tools of automatic control are used to reduce the complexity of digital simulations in the time domain. The method is applied to a real trombone, and results of simulations are presented and compared with measurements. This method seems to be a promising approach, in terms of modularity, computational complexity and accuracy, for any acoustic resonator based on tubes.
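To illustrate the traveling-wave picture underlying such Kelly-Lochbaum networks, the toy sketch below simulates a single lossless cylindrical section as two delay lines (right- and left-going waves) with reflections at both ends. It is a deliberately simplified illustration, not the paper's Webster-Lokshin implementation (no visco-thermal losses, no curvature, hypothetical names throughout).

```python
from collections import deque

def waveguide_tube(n_samples, delay, r_left=-0.95, r_right=-0.95):
    # One cylindrical tube section as a digital waveguide: two traveling-wave
    # delay lines of `delay` samples each, with scalar reflection
    # coefficients at the two ends. Returns the wave arriving at the right
    # end at each sample (echoes of the excitation recur every 2*delay).
    right = deque([0.0] * delay)   # right-going traveling wave
    left = deque([0.0] * delay)    # left-going traveling wave
    right[0] = 1.0                 # unit impulse injected at the left end
    out = []
    for _ in range(n_samples):
        arriving_r = right[-1]     # wave reaching the right end
        arriving_l = left[-1]      # wave reaching the left end
        out.append(arriving_r)
        right.pop()
        left.pop()
        right.appendleft(r_left * arriving_l)    # reflection at the left end
        left.appendleft(r_right * arriving_r)    # reflection at the right end
    return out
```

The full method replaces these scalar reflections with the discrete-time state-space blocks of the Webster-Lokshin elements, so that sections with varying cross-section and losses can be assembled in the same network.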