
    Final Research Report for Sound Design and Audio Player

    This deliverable describes the work on Task 4.3 Algorithms for sound design and feature developments for audio player. The audio player runs on the in-store player (ISP) and takes care of rendering the music playlists via beat-synchronous automatic DJ mixing, taking advantage of the rich musical content description extracted in T4.2 (beat markers, structural segmentation into intro and outro, musical and sound content classification). The deliverable covers prototypes and final results on: (1) automatic beat-synchronous mixing by beat alignment and time stretching – we developed an algorithm for beat alignment and scheduling of time-stretched tracks; (2) compensation of play duration changes introduced by time stretching – in order to make the playlist generator independent of beat mixing, we chose to readjust the tempo of played tracks such that their stretched duration is the same as their original duration; (3) prospective research on the extraction of data from DJ mixes – to alleviate the lack of extensive ground truth databases of DJ mixing practices, we propose steps towards extracting this data from existing mixes by alignment and unmixing of the tracks in a mix. We also show how these methods can be evaluated even without labelled test data, and propose an open dataset for further research; (4) a description of the software player module, a GUI-less application to run on the ISP that performs streaming of tracks from disk and beat-synchronous mixing. The estimation of cue points where tracks should cross-fade is now described in D4.7 Final Research Report on Auto-Tagging of Music.
    EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC D
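    Item (2) above can be illustrated with a small calculation. The following is only a sketch under assumptions not stated in the abstract: the intro and outro are played at fixed rates imposed by the neighbouring tracks, and a single constant rate is applied to the rest of the track so that the total played duration equals the original one. The function and parameter names are hypothetical.

```python
def compensating_rate(duration, intro_len, outro_len, intro_rate, outro_rate):
    """Playback rate for the middle section of a track (hypothetical helper).

    Assumption: the intro and outro are time-stretched at fixed rates
    (rate > 1 means faster playback), and the middle section absorbs the
    difference so that the whole track still lasts `duration` seconds.
    """
    middle_len = duration - intro_len - outro_len
    if middle_len <= 0:
        raise ValueError("intro and outro must leave a non-empty middle section")
    # Wall-clock time taken by the stretched intro and outro.
    mix_time = intro_len / intro_rate + outro_len / outro_rate
    # Wall-clock time left for the middle section.
    middle_budget = duration - mix_time
    if middle_budget <= 0:
        raise ValueError("stretched intro and outro already exceed the duration")
    return middle_len / middle_budget


def played_duration(duration, intro_len, outro_len,
                    intro_rate, outro_rate, middle_rate):
    """Total wall-clock duration of the track after stretching all three parts."""
    middle_len = duration - intro_len - outro_len
    return (intro_len / intro_rate
            + middle_len / middle_rate
            + outro_len / outro_rate)


if __name__ == "__main__":
    # A 240 s track with a 20 s intro sped up and a 20 s outro slowed down.
    rate = compensating_rate(240.0, 20.0, 20.0, 1.25, 0.8)
    total = played_duration(240.0, 20.0, 20.0, 1.25, 0.8, rate)
    print(f"middle rate: {rate:.6f}, played duration: {total:.3f} s")
```

    With this choice the playlist generator can keep using the original track durations, which is the independence the abstract refers to; how the deliverable actually distributes the tempo change is not specified here.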

    Final Research Report on Auto-Tagging of Music

    The deliverable D4.7 concerns the work achieved by IRCAM until M36 for the “auto-tagging of music”. The deliverable is a research report. The software libraries resulting from the research have been integrated into Fincons/HearDis! Music Library Manager or are used by TU Berlin. The final software libraries are described in D4.5. The research work on auto-tagging has concentrated on four aspects: 1) Further improving IRCAM’s machine-learning system ircamclass. This has been done by developing the new MASSS audio features and by including audio augmentation and audio segmentation in ircamclass. The system has then been applied to train HearDis! “soft” features (Vocals-1, Vocals-2, Pop-Appeal, Intensity, Instrumentation, Timbre, Genre, Style). This is described in Part 3. 2) Developing two sets of “hard” features (i.e. related to musical or musicological concepts) as specified by HearDis! (for integration into Fincons/HearDis! Music Library Manager) and TU Berlin (as input for the prediction model of the GMBI attributes). Such features are derived either from previously estimated higher-level concepts (such as structure, key or succession of chords) or from newly developed signal processing algorithms (such as HPSS or main melody estimation). This is described in Part 4. 3) Developing audio features to characterize the audio quality of a music track. The goal is to describe the quality of the audio independently of its apparent encoding. This is then used to estimate audio degradation or music decade, so as to ensure that playlists contain tracks with similar audio quality. This is described in Part 5. 4) Developing innovative algorithms to extract specific audio features to improve music mixes. So far, innovative techniques (based on various Blind Audio Source Separation algorithms and Convolutional Neural Networks) have been developed for singing voice separation, singing voice segmentation, music structure boundaries estimation, and DJ cue-region estimation. This is described in Part 6.
    EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC D

    Usability of Musical Digital Libraries: a Multimodal Analysis.

    There has been substantial research on technical aspects of musical digital libraries, but comparatively little on usability aspects. We have evaluated four web-accessible music libraries, focusing particularly on features that are particular to music libraries, such as music retrieval mechanisms. Although the original focus of the work was on how modalities are combined within the interactions with such libraries, that was not where the main difficulties were found. Libraries were generally well designed for use of different modalities. The main challenges identified relate to the details of melody matching and to simplifying the choices of file format. These issues are discussed in detail.

    Gesture-Controlled Interaction with Aesthetic Information Sonification

    Information representation in augmented and virtual reality systems, and social physical (building) spaces can enhance the efficacy of interacting with and assimilating abstract, non-visual data. Sonification is the process of automatically generating a real-time auditory representation of information. There is a gap in our implementation and knowledge of auditory display systems used to enhance interaction in virtual and augmented reality. This paper addresses that gap by examining methodologies for mapping socio-spatial data to spatialised sonification manipulated with gestural controllers. This is a system of interactive knowledge representation that completes the human integration loop, enabling the user to interact with and manipulate data using 3D spatial gesture and 3D auditory display. Benefits include 1) added immersion in an augmented or virtual reality interface; 2) auditory display avoids visual overload in visually-saturated processes such as designing, evacuation in emergencies, flying aircraft, and computer gaming; and 3) bi-modal or auditory representation, due to its time-based character, facilitates cognition of complex information.

    Digital waveguide modeling for wind instruments: building a state-space representation based on the Webster-Lokshin model

    This paper deals with digital waveguide modeling of wind instruments. It presents the application of state-space representations for the refined acoustic model of Webster-Lokshin. This acoustic model describes the propagation of longitudinal waves in axisymmetric acoustic pipes with a varying cross-section, visco-thermal losses at the walls, and without assuming planar or spherical waves. Moreover, three types of discontinuities of the shape can be taken into account (radius, slope and curvature). The purpose of this work is to build low-cost digital simulations in the time domain based on the Webster-Lokshin model. First, decomposing a resonator into independent elementary parts and isolating delay operators lead to a Kelly-Lochbaum network of input/output systems and delays. Second, for a systematic assembling of elements, their state-space representations are derived in discrete time. Then, standard tools of automatic control are used to reduce the complexity of digital simulations in the time domain. The method is applied to a real trombone, and results of simulations are presented and compared with measurements. This method seems to be a promising approach in terms of modularity, complexity of calculation and accuracy, for any acoustic resonator based on tubes.
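    For readers unfamiliar with Kelly-Lochbaum networks, the sketch below shows the classical scattering junction between two lossless cylindrical tube sections of different cross-sectional area. This is a deliberate simplification and not the paper's method: the Webster-Lokshin model adds varying cross-section, visco-thermal losses and slope and curvature discontinuities, none of which are represented here. The function names are hypothetical.

```python
def reflection_coefficient(area_left, area_right):
    """Pressure reflection coefficient seen by a wave travelling left to right.

    Assumes lossless cylindrical sections, whose characteristic impedance is
    inversely proportional to the cross-sectional area.
    """
    return (area_left - area_right) / (area_left + area_right)


def scatter(p_left_in, p_right_in, k):
    """Classical Kelly-Lochbaum junction for pressure waves.

    p_left_in  : wave arriving at the junction from the left section
    p_right_in : wave arriving at the junction from the right section
    Returns (p_left_out, p_right_out): the waves leaving the junction
    back into the left section and onward into the right section.
    """
    p_left_out = k * p_left_in + (1.0 - k) * p_right_in
    p_right_out = (1.0 + k) * p_left_in - k * p_right_in
    return p_left_out, p_right_out


if __name__ == "__main__":
    # A tube that narrows from area 2 to area 1 (arbitrary units).
    k = reflection_coefficient(2.0, 1.0)
    back, forward = scatter(1.0, 0.0, k)
    # Total pressure is continuous across the junction:
    # incoming + reflected on the left equals transmitted on the right.
    print(k, back, forward)
```

    In a full waveguide model, junctions of this kind are chained together with delay lines that represent propagation along each section; the paper replaces the simple junctions with input/output systems given in state-space form.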