239 research outputs found

    Transient and steady-state component separation for audio signals

    In this work, the problem of separating an audio signal into transient and steady-state components was addressed. In particular, a recently proposed separation method based on the median filter was investigated. For a better understanding of the processes involved, a modification of the algorithm's filtering stage was proposed; this modification was evaluated subjectively by listening tests and objectively by an application-based comparison. Finally, some extensions to the model were presented, together with possible applications of the transient and steady-state decomposition in the area of audio editing and processing.
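    The abstract does not reproduce the filter equations, but the baseline decomposition it builds on can be sketched compactly: median-filter the magnitude spectrogram along time to favour steady-state energy and along frequency to favour transients, then derive soft masks. The Python sketch below assumes this standard median-filter scheme; the filter length, FFT settings, and mask rule are illustrative choices, not values from the paper.

    ```python
    import numpy as np
    import scipy.ndimage
    import scipy.signal

    def hp_separate(x, sr, n_fft=2048, hop=512, kernel=17):
        """Split x into steady-state (harmonic) and transient (percussive) parts."""
        _, _, X = scipy.signal.stft(x, sr, nperseg=n_fft, noverlap=n_fft - hop)
        S = np.abs(X)
        # Steady-state energy is smooth along time: median-filter each frequency row.
        H = scipy.ndimage.median_filter(S, size=(1, kernel))
        # Transient energy is smooth along frequency: median-filter each time column.
        P = scipy.ndimage.median_filter(S, size=(kernel, 1))
        # Soft Wiener-style masks share each time-frequency bin between components.
        Mh = H**2 / (H**2 + P**2 + 1e-10)
        _, x_steady = scipy.signal.istft(Mh * X, sr, nperseg=n_fft, noverlap=n_fft - hop)
        _, x_trans = scipy.signal.istft((1.0 - Mh) * X, sr, nperseg=n_fft, noverlap=n_fft - hop)
        return x_steady, x_trans
    ```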

    Finding Music in Chaos: Designing and Composing with Virtual Instruments Inspired by Chaotic Equations

    The use of chaos theory to design novel audio synthesis engines has received little exploration in computer music. This may be due to the difficulty of obtaining harmonic tones, or to the tendency of chaos-based synthesis engines to "explode", which requires re-instantiating the engine before sound production can continue. This is undesirable when composing, since time is wasted fixing the synthesis engine rather than focusing on the creative aspects of composition. One way to remedy these issues is to connect chaotic equations to individual parts of the synthesis engine instead of relying on chaos as the primary source of all sound-producing procedures: for example, by creating a physically based synthesis model and connecting chaotic equations to individual parts of that model. The goal of this project is to design a physically inspired virtual instrument, based on a conceptual percussion instrument model, that uses chaos theory in its synthesis engine to explore novel sounds in a reliable and repeatable way for other composers and performers. The project presents a two-movement composition built on these concepts and a modular set of virtual instruments that can be used by anyone, playable through a new electronic music controller called the Hexapad controller as well as standard MIDI controllers. The physically inspired instrument created for the Hexapad controller is called the Ambi-Drum; standard MIDI controllers are used to control synthesis parameters and the other virtual instruments.
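    As a hypothetical illustration of this design principle (not the Ambi-Drum itself, whose internals are not given here), the sketch below keeps the sound-producing core stable (a small bank of exponentially decaying sine modes) and lets a Lorenz system modulate only the decay parameter. Because the chaotic signal is normalised into a bounded range, the voice cannot "explode" the way a purely chaotic oscillator can. All constants are illustrative.

    ```python
    import numpy as np

    def lorenz(n, dt=0.005, s=10.0, r=28.0, b=8.0 / 3.0):
        """Euler-integrated Lorenz system; returns n states of (x, y, z)."""
        xyz = np.empty((n, 3))
        xyz[0] = (1.0, 1.0, 1.0)
        for i in range(n - 1):
            x, y, z = xyz[i]
            xyz[i + 1] = xyz[i] + dt * np.array([s * (y - x), x * (r - z) - y, x * y - b * z])
        return xyz

    def drum_hit(sr=44100, dur=1.0, modes=(180.0, 271.0, 523.0)):
        """Stable modal drum voice whose decay is modulated by a chaotic signal."""
        t = np.arange(int(sr * dur)) / sr
        chaos = lorenz(t.size)[:, 0]
        chaos = (chaos - chaos.min()) / (np.ptp(chaos) + 1e-12)  # normalise to [0, 1]
        decay = 4.0 + 3.0 * chaos                 # bounded, time-varying decay rate (1/s)
        env = np.exp(-np.cumsum(decay) / sr)      # integrate the decay rate over time
        out = sum(np.sin(2 * np.pi * f * t) for f in modes) * env
        return out / np.max(np.abs(out))
    ```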

    Separation of musical sources and structure from single-channel polyphonic recordings

    EThOS - Electronic Theses Online Service, United Kingdom

    Real-time Sound Source Separation For Music Applications

    Sound source separation refers to the task of extracting individual sound sources from a number of mixtures of those sound sources. In this thesis, a novel sound source separation algorithm for musical applications is presented. It leverages the fact that the vast majority of commercially recorded music since the 1950s has been mixed down for two-channel reproduction, more commonly known as stereo. The algorithm, presented in Chapter 3 of this thesis, requires no prior knowledge or learning and performs separation based purely on azimuth discrimination within the stereo field. It exploits the use of the pan pot as a means of achieving image localisation within stereophonic recordings: for a single source, only an interaural intensity difference exists between the left and right channels. Gain scaling and phase cancellation techniques are used to expose frequency-dependent nulls across the azimuth domain, from which source separation and resynthesis are carried out. The algorithm is demonstrated not only to be state of the art in the field of sound source separation but also to be a useful pre-process for other tasks such as music segmentation and surround sound upmixing.
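    The thesis's own implementation is given in Chapter 3; the following is only a minimal sketch of the nulling idea described above. For each STFT bin, the left channel is compared against the right channel scaled by a range of candidate pan gains; a pan-potted source cancels at the gain matching its intensity ratio, and the depth of that null approximates the source magnitude. The azimuth resolution, window settings, and binary mask rule are illustrative assumptions.

    ```python
    import numpy as np
    import scipy.signal

    def azimuth_extract(left, right, sr, target, beta=100, n_fft=4096, hop=1024):
        """Extract the source whose pan position maps to azimuth index `target`."""
        _, _, L = scipy.signal.stft(left, sr, nperseg=n_fft, noverlap=n_fft - hop)
        _, _, R = scipy.signal.stft(right, sr, nperseg=n_fft, noverlap=n_fft - hop)
        gains = np.arange(beta + 1) / beta        # candidate pan gains in [0, 1]
        # A pan-potted source cancels where the gain matches its intensity ratio;
        # the stacked array is (gains, freq, frames), fine for short excerpts.
        A = np.stack([np.abs(L - g * R) for g in gains])
        null = A.argmin(axis=0)                   # azimuth index of the null per bin
        depth = A.max(axis=0) - A.min(axis=0)     # null depth ~ source magnitude
        Y = np.where(null == target, depth, 0.0) * np.exp(1j * np.angle(L))
        _, y = scipy.signal.istft(Y, sr, nperseg=n_fft, noverlap=n_fft - hop)
        return y
    ```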

    Final Research Report on Auto-Tagging of Music

    The deliverable D4.7 concerns the work achieved by IRCAM up to M36 on the "auto-tagging of music". The deliverable is a research report. The software libraries resulting from the research have been integrated into the Fincons/HearDis! Music Library Manager or are used by TU Berlin; the final software libraries are described in D4.5. The research work on auto-tagging has concentrated on four aspects:
    1) Further improving IRCAM's machine-learning system ircamclass, by developing the new MASSS audio features and adding audio augmentation (sketched after this list) and audio segmentation to ircamclass. The system has then been applied to train HearDis! "soft" features (Vocals-1, Vocals-2, Pop-Appeal, Intensity, Instrumentation, Timbre, Genre, Style). This is described in Part 3.
    2) Developing two sets of "hard" features (i.e. features related to musical or musicological concepts) as specified by HearDis! (for integration into the Fincons/HearDis! Music Library Manager) and by TU Berlin (as input for the prediction model of the GMBI attributes). Such features are either derived from previously estimated higher-level concepts (such as structure, key, or the succession of chords) or obtained by developing new signal-processing algorithms (such as HPSS or main melody estimation). This is described in Part 4.
    3) Developing audio features to characterize the audio quality of a music track. The goal is to describe the quality of the audio independently of its apparent encoding; this is then used to estimate audio degradation or the decade of a recording, and ultimately to ensure that playlists contain tracks with similar audio quality. This is described in Part 5.
    4) Developing innovative algorithms to extract specific audio features to improve music mixes. So far, innovative techniques (based on various blind audio source separation algorithms and convolutional neural networks) have been developed for singing voice separation, singing voice segmentation, music structure boundary estimation, and DJ cue-region estimation. This is described in Part 6.
    EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC DJ
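    The report does not publish the ircamclass internals; purely as an illustration of the audio augmentation mentioned in point 1, the sketch below generates perturbed copies of a training excerpt (pitch shift, time stretch, additive noise) so that a tag classifier sees more acoustic variability. The librosa calls are standard, but the parameter ranges are our own illustrative assumptions.

    ```python
    import numpy as np
    import librosa

    def augment(y, sr, rng=None):
        """Return the original excerpt plus three perturbed copies of it."""
        rng = rng or np.random.default_rng(0)
        views = [y]
        views.append(librosa.effects.pitch_shift(y, sr=sr, n_steps=rng.uniform(-2, 2)))
        views.append(librosa.effects.time_stretch(y, rate=rng.uniform(0.9, 1.1)))
        views.append(y + rng.normal(0.0, 0.005 * np.std(y), size=y.shape))  # light noise
        return views
    ```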

    Pitch-Informed Solo and Accompaniment Separation

    This thesis addresses the development of a system for pitch-informed solo and accompaniment separation capable of separating main instruments from the music accompaniment regardless of the musical genre of the track or the type of accompaniment. For the solo instrument, only pitched monophonic instruments were considered, in a single-channel scenario where no panning or spatial location information is available. In the proposed method, pitch information is used as the initial stage of a sinusoidal modeling approach that attempts to estimate the spectral information of the solo instrument from a given audio mixture. Instead of estimating the solo instrument on a frame-by-frame basis, the proposed method gathers information on tone objects to perform separation. Tone-based processing allowed the inclusion of novel processing stages for attack refinement, transient interference reduction, common amplitude modulation (CAM) of tone objects, and better estimation of the non-harmonic elements that can occur in musical instrument tones. The proposed solo and accompaniment algorithm is an efficient method suitable for real-world applications. A study was conducted to better model the magnitude, frequency, and phase of isolated musical instrument tones; as a result, temporal envelope smoothness, the inharmonicity of musical instruments, and phase expectation were exploited in the proposed separation method. Additionally, an algorithm for harmonic/percussive separation based on phase expectation was proposed, which shows improved perceptual quality with respect to state-of-the-art methods for harmonic/percussive separation. The proposed solo and accompaniment method obtained perceptual quality scores comparable to other state-of-the-art algorithms in the SiSEC 2011 and SiSEC 2013 campaigns, and outperformed the comparison algorithm on the instrumental dataset described in this thesis. As a use case of solo and accompaniment separation, a listening test procedure was conducted to assess separation quality requirements in the context of music education. Results from the listening test showed that solo and accompaniment tracks should be optimized differently to suit the quality requirements of music education. The Songs2See application was presented as commercial music learning software that includes the proposed solo and accompaniment separation method.
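    The thesis operates on tone objects with several refinement stages; the sketch below shows only the core pitch-informed idea in its crudest frame-wise form: given an f0 track, keep the STFT bins around each expected harmonic as the solo estimate and leave the remainder as accompaniment. Function names, the bin neighbourhood width, and all settings are our own illustrative assumptions, not the thesis implementation.

    ```python
    import numpy as np
    import scipy.signal

    def solo_accompaniment(x, sr, f0_track, n_harm=20, n_fft=4096, hop=1024):
        """Split a mono mixture into solo (harmonic bins of f0) and accompaniment."""
        _, _, X = scipy.signal.stft(x, sr, nperseg=n_fft, noverlap=n_fft - hop)
        mask = np.zeros(X.shape, dtype=bool)
        for j, f0 in enumerate(f0_track[: X.shape[1]]):   # one f0 value per frame
            if f0 <= 0:
                continue                                  # unvoiced frame: keep nothing
            for h in range(1, n_harm + 1):
                k = int(round(h * f0 * n_fft / sr))       # bin of the h-th harmonic
                if 1 <= k < mask.shape[0] - 1:
                    mask[k - 1 : k + 2, j] = True         # harmonic bin plus neighbours
        _, solo = scipy.signal.istft(np.where(mask, X, 0), sr, nperseg=n_fft, noverlap=n_fft - hop)
        _, acc = scipy.signal.istft(np.where(mask, 0, X), sr, nperseg=n_fft, noverlap=n_fft - hop)
        return solo, acc
    ```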

    Applying source separation to music

    Separation of existing audio into remixable elements is very useful for repurposing music audio. Applications include upmixing video soundtracks to surround sound (e.g. home theater 5.1 systems), facilitating music transcription, allowing better mashups and remixes for disc jockeys, and rebalancing sound levels of multiple instruments or voices recorded simultaneously to a single track. In this chapter, we provide an overview of the algorithms and approaches designed to address the challenges and opportunities in music. Where applicable, we also introduce commonalities and links to source separation for video soundtracks, since many musical scenarios involve video soundtracks (e.g. YouTube recordings of live concerts, movie soundtracks). While space prohibits describing every method in detail, we include detail on representative music-specific algorithms and approaches not covered in other chapters. The intent is to give the reader a high-level understanding of the workings of key exemplars of the source separation approaches applied in this domain.

    Towards Real-Time Non-Stationary Sinusoidal Modelling of Kick and Bass Sounds for Audio Analysis and Modification

    Sinusoidal modelling is a powerful and flexible parametric method for analysing and processing audio signals. These signals have an underlying structure that modern spectral models aim to exploit by separating the signal into sinusoidal, transient, and noise components, each of which can then be modelled in the manner most appropriate to its inherent structure. The accuracy of the estimated parameters is directly related to the quality of the model's representation of the signal and to the assumptions made about its underlying structure. For sinusoidal models, these assumptions generally affect the non-stationary estimates of amplitude and frequency modulation and the type of amplitude-change curve. This is especially true when using a single analysis frame in a non-overlapping framework, where biased estimates can result in discontinuities at frame boundaries. It is therefore desirable for such a model to distinguish between the shapes of different amplitude changes and to adapt its estimation accordingly. Intra-frame amplitude change can be interpreted as a change in the windowing function applied to a stationary sinusoid, which can be estimated from the derivative of the phase with respect to frequency at magnitude peaks in the DFT spectrum. A method for measuring monotonic linear amplitude change from single-frame estimates using the first-order derivative of the phase with respect to frequency (approximated by the first-order difference) is presented, along with a method for distinguishing between linear and exponential amplitude change. An adaptation of the popular matching pursuit algorithm for refining model parameters in a segmented framework is investigated, using a dictionary of sinusoids whose parameters vary slightly from the model estimates, based on Modelled Pursuit (MoP). Modelling of the residual signal using a segmented undecimated wavelet transform (segUWT) is presented, together with a generalisation of the forward and inverse transforms covering delay compensation and overlap extension for different wavelet lengths and numbers of decomposition levels in an overlap-save (OLS) implementation that deals with convolution block-based artefacts. This shift-invariant implementation of the DWT is a popular tool for de-noising and shows promising results for the separation of transients from noise.
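    The thesis derives a calibrated estimator; the snippet below is only a numerical illustration of the underlying observation. Linear amplitude change inside a frame shifts the temporal centre of the windowed sinusoid's energy, which shows up as a change in the slope of the DFT phase across frequency at the magnitude peak (approximated here, as in the abstract, by a first-order difference). The signal parameters are arbitrary test values.

    ```python
    import numpy as np

    def peak_phase_slope(frame):
        """First-order phase difference across frequency at the magnitude peak."""
        X = np.fft.rfft(frame * np.hanning(frame.size))
        k = int(np.argmax(np.abs(X)))          # magnitude peak bin
        phase = np.unwrap(np.angle(X))
        return phase[k + 1] - phase[k]         # approximates d(phase)/d(freq)

    sr, n, f0 = 44100, 2048, 440.0
    t = np.arange(n) / sr
    stationary = np.sin(2 * np.pi * f0 * t)
    ramped = np.linspace(0.2, 1.0, n) * stationary    # linear intra-frame amplitude change
    print(peak_phase_slope(stationary))               # slope for a stationary tone
    print(peak_phase_slope(ramped))                   # slope shifts with amplitude change
    ```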

    Towards an interactive framework for robot dancing applications

    Internship carried out at INESC-Porto and supervised by Prof. Doutor Fabien Gouyon. Integrated master's thesis, Electrical and Computer Engineering - Telecommunications Major. Faculty of Engineering, University of Porto, 200