15 research outputs found

    Randomized Signal Processing with Continuous Frames

    This paper focuses on signal processing tasks in which the signal is transformed from the signal space to a higher-dimensional phase space using a continuous frame, processed in that space, and synthesized to an output signal. For example, in a phase vocoder method, an audio signal is transformed to the time-frequency plane via the short-time Fourier transform, manipulated there, and synthesized to an output audio signal. We show how to approximate such methods, termed phase space signal processing methods, using a Monte Carlo method. The Monte Carlo method speeds up computations because the number of samples required for a given accuracy is proportional to the dimension of the signal space, not to the typically higher dimension of phase space. We exploit this property in a new phase vocoder method based on an enhanced time-frequency space with more dimensions than the classical method uses. The higher dimension of phase space improves the quality of the method while retaining the computational complexity of a standard phase vocoder based on regular samples.
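The Monte Carlo idea in the abstract above can be illustrated with a toy sketch. It is not the paper's algorithm: it assumes a finite tight frame of `m` vectors in the plane (a stand-in for a continuous frame) and uniform random sampling of frame indices. The names `tight_frame`, `exact_synthesis`, and `monte_carlo_synthesis` are hypothetical. The sketch only shows that averaging `k` randomly chosen analysis/synthesis terms gives an unbiased estimate of the full sum, with an error that depends on `k` rather than on `m`.

```python
import math
import random


def tight_frame(m):
    """Return m vectors f_j in R^2 such that sum_j <x, f_j> f_j = x (for m >= 3)."""
    scale = math.sqrt(2.0 / m)
    return [
        (scale * math.cos(2.0 * math.pi * j / m), scale * math.sin(2.0 * math.pi * j / m))
        for j in range(m)
    ]


def exact_synthesis(x, frame):
    """Full analysis followed by synthesis: uses every frame vector."""
    out = [0.0, 0.0]
    for f in frame:
        c = x[0] * f[0] + x[1] * f[1]  # analysis coefficient <x, f_j>
        out[0] += c * f[0]
        out[1] += c * f[1]
    return out


def monte_carlo_synthesis(x, frame, k, rng):
    """Estimate the same sum from k uniformly sampled frame vectors (toy assumption)."""
    m = len(frame)
    out = [0.0, 0.0]
    for _ in range(k):
        f = frame[rng.randrange(m)]
        c = x[0] * f[0] + x[1] * f[1]
        out[0] += c * f[0]
        out[1] += c * f[1]
    # Rescale by m / k so that the expected value equals the full sum.
    return [m / k * v for v in out]


frame = tight_frame(100_000)
x = (1.0, -2.0)
estimate = monte_carlo_synthesis(x, frame, 20_000, random.Random(0))
```

Here the estimate touches 20,000 of the 100,000 frame vectors. Any processing of the coefficients in phase space would be applied to `c` before the synthesis step; it is left out to keep the sketch short.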

    Audio- ja puhesignaalien aika-asteikon muuttaminen (Time-scale modification of audio and speech signals)

    In audio time-scale modification (TSM), the duration of an audio recording is changed while its local frequency content is retained. In this thesis, a novel phase-vocoder-based TSM technique was developed. It builds on a new concept: fuzzy classification of the points in the time-frequency representation of the input signal. Each point is assigned a continuous degree of membership in three signal classes: tonalness, noisiness, and transientness. The classification is used to preserve the distinct character of these components during modification. The quality of the proposed method was evaluated in a listening test, where it scored slightly higher than a state-of-the-art academic TSM technique and on par with a commercial TSM software product. The proposed method is suitable for high-quality TSM of a wide variety of audio and speech signals.
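For readers unfamiliar with TSM, the basic operation of changing duration by re-spacing windowed frames can be sketched as follows. This is plain overlap-add (OLA), a much simpler technique than the phase vocoder with fuzzy classification described in the thesis, and it is shown only to make the notions of analysis hop and synthesis hop concrete. The function name `ola_time_stretch` and the frame length of 256 samples are assumptions made for illustration.

```python
import math


def ola_time_stretch(x, stretch, frame=256):
    """Plain OLA: read frames every hop_out / stretch samples, write them every hop_out samples."""
    hop_out = frame // 2           # synthesis hop (50 % overlap)
    hop_in = hop_out / stretch     # analysis hop; stretch > 1 lengthens the signal
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame) for n in range(frame)]  # Hann
    n_frames = max(1, int((len(x) - frame) / hop_in) + 1)
    out_len = (n_frames - 1) * hop_out + frame
    out = [0.0] * out_len
    norm = [0.0] * out_len         # accumulated window weight, used for normalization
    for k in range(n_frames):
        src = int(round(k * hop_in))
        dst = k * hop_out
        for n in range(frame):
            if src + n < len(x):
                out[dst + n] += window[n] * x[src + n]
                norm[dst + n] += window[n]
    # Divide by the accumulated window weight; positions with no weight stay at zero.
    return [o / w if w > 1e-9 else 0.0 for o, w in zip(out, norm)]


tone = [math.sin(2.0 * math.pi * 220.0 * n / 8000.0) for n in range(4000)]
stretched = ola_time_stretch(tone, 2.0)   # roughly twice as long
```

Plain OLA does not keep the phases of overlapping frames consistent, so tonal signals pick up audible artifacts. Phase vocoder methods, such as the one in the thesis, address this by adjusting phases in the time-frequency domain.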

    A transient-preserving audio time-stretching algorithm and a real-time realization for a commercial music product

    The core of this work is a sub-band transient detection and preservation scheme based on complex-domain transient detection and inspired by Robel's work. The proposed technique can be integrated into a real-time phase vocoder analysis/synthesis scheme without introducing latency and at relatively low computational cost.

    Computationally efficient music synthesis : methods and sound design

    TÀssÀ diplomityössÀ esitetÀÀn musiikkisyntetisaattorin suunnittelua systeemille, jonka laskentateho ja muistikapasiteetti ovat rajoitettuja. Ensiksi kerrataan mahdollisia synteesitekniikoita sekÀ arvioidaan niiden kÀyttökelpoisuutta laskennallisesti tehokkaassa musiikkisynteesissÀ. KÀytÀnnössÀ kÀyttökelpoiset tekniikat ovat lisÀÀvÀ ja lÀhde-suodinsynteesit, ja erikoistapauksissa taajuusmodulaatio-, aaltotaulukko- ja samplaussynteesit. TÀmÀn jÀlkeen kÀyttökelpoisten tekniikoiden rakenteiden suunnittelua esitetÀÀn tarkemmin, sekÀ esitetÀÀn nÀiden rakenteiden ominaisuuksia ja suunnitteluongelmia. Suurin ongelma kohdataan digitaalisessa lÀhde-suodinsynteesissÀ, jossa klassisten aaltomuotojen, kuten saha-aallon kÀyttö lÀhdesignaalina on ongelmallista laskostumisen takia, joka johtuu aaltomuodossa olevista epÀjatkuvuuksista. Olemassa olevia kaistarajoitettuja aaltomuotosynteesimenetelmiÀ kerrataan, ja polynomimuotoiseen kaistarajoitetuun askelfunktioon perustuvaa menetelmÀÀ esitellÀÀn tarkemmin antamalla suunnittelusÀÀntöjÀ kÀyttökelpoisille polynomeille. MenetelmÀÀ testataan lisÀksi kahdella kolmannen asteen polynomilla. NÀmÀ polynomit vÀhentÀvÀt laskostumista korkeilla taajuuksilla enemmÀn verrattuna ensimmÀisen asteen polynomiin, mutta pienillÀ taajuksilla ensimmÀisen asteen polynomi tuottaa parempia tuloksia. LisÀksi kerrataan muita mahdollisia ÀÀniefektialgoritmeja ja arvioidaan niiden kÀyttökelpoisuutta laskennallisesti tehokkaassa musiikkisynteesissÀ. Useasti ÀÀnisynteesisysteemin tÀytyy pystyÀ generoimaan musiikkia, jossa kÀytetÀÀn monia erilaisia ÀÀniÀ, jotka ulottuvat oikeista akustisista soittimista elektronisiin soittimiin ja luonnon ÀÀniin. Siksi tÀllainen systeemi tarvitsee huolellista ÀÀnten suunnittelua. TÀssÀ diplomityössÀ esitetÀÀn suunnittelusÀÀntöjÀ erilaisten ÀÀnien imitoimiseksi. 
LisÀksi esitellÀÀn synteesimenetelmien parametrien vaikutus ÀÀnivarianttien suunnitteluun.In this thesis, the design of a music synthesizer for systems suffering from limitations in computing power and memory capacity is presented. First, different possible synthesis techniques are reviewed and their applicability in computationally efficient music synthesis is discussed. In practice, the applicable techniques are limited to additive and source-filter synthesis, and, in special cases, to frequency modulation, wavetable and sampling synthesis. Next, the design of the structures of the applicable techniques are presented in detail, and properties and design issues of these structures are discussed. A major implementation problem is raised in digital source-filter synthesis, where the use of classic waveforms, such as sawtooth wave, as the source signal is challenging due to aliasing caused by waveform discontinuities. Methods for existing bandlimited waveform synthesis are reviewed, and a new approach using polynomial bandlimited step function is presented in detail with design rules for the applicable polynomials. The approach is also tested with two different third-order polynomials. They reduce aliasing more at high frequencies, but at low frequencies their performance is worse than with the first-order polynomial. In addition, some commonly used sound effect algorithms are reviewed with respect to their applicability in computationally efficient music synthesis. In many cases the sound synthesis system must be capable of producing music consisting of various different sounds ranging from real acoustic instruments to electronic instruments and sounds from nature. Therefore, the music synthesis system requires careful sound design. In this thesis, sound design rules for imitation of various sounds using the computationally efficient synthesis techniques are presented. In addition, the effects of the parameter variation for the design of sound variants are presented

    Independent formant and pitch control applied to singing voice

    Thesis (MScIng), University of Stellenbosch, 2004. A singing voice can be manipulated artificially with a digital computer, either to create new melodies or to correct existing ones. When simple algorithms change the fundamental frequency of an audio signal that represents a human voice, the formants of the voice tend to move to new frequency locations, which makes the voice sound unnatural. The main purpose of the thesis is to design a technique by which the pitch and formants of a singing voice can be controlled independently.

    Statistical models for natural sounds

    It is important to understand the rich structure of natural sounds in order to solve tasks such as automatic speech recognition and to understand auditory processing in the brain. This thesis takes a step in this direction by characterising the statistics of simple natural sounds. We focus on the statistics because perception often appears to depend on them rather than on the raw waveform. For example, the perception of auditory textures, like running water, wind, fire and rain, depends on summary statistics, like the rate of falling rain droplets, rather than on the exact details of the physical source. In order to analyse the statistics of sounds accurately, it is necessary to improve a number of traditional signal processing methods, including those for amplitude demodulation, time-frequency analysis, and sub-band demodulation. These estimation tasks are ill-posed, and it is therefore natural to treat them as Bayesian inference problems. The new probabilistic versions of these methods have several advantages. For example, they perform more accurately on natural signals, are more robust to noise, can fill in missing sections of data, and provide error bars. Furthermore, free parameters can be learned from the signal. Using these new algorithms, we demonstrate that the energy, sparsity, modulation depth and modulation time-scale in each sub-band of a signal are critical statistics, together with the dependencies between the sub-band modulators. To validate this claim, a model containing co-modulated coloured noise carriers is shown to be capable of generating a range of realistic-sounding auditory textures. Finally, we explore the connection between the statistics of natural sounds and perception. We demonstrate that inference in the model for auditory textures qualitatively replicates the primitive grouping rules that listeners use to understand simple acoustic scenes. This suggests that the auditory system is optimised for the statistics of natural sounds.