15 research outputs found

Randomized Signal Processing with Continuous Frames

Authors: Avron Haim, Levie Ron
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/09/2020

This paper focuses on signal processing tasks in which the signal is transformed from the signal space to a higher-dimensional phase space using a continuous frame, processed in this space, and synthesized to an output signal. For example, in a phase vocoder method, an audio signal is transformed to the time-frequency plane via the short-time Fourier transform, manipulated there, and synthesized to an output audio signal. We show how to approximate such methods, termed phase space signal processing methods, using a Monte Carlo method. The Monte Carlo method speeds up computations, since the number of samples required for a certain accuracy is proportional to the dimension of the signal space, and not to the dimension of phase space, which is typically higher. We utilize this property for a new phase vocoder method, based on an enhanced time-frequency space, with more dimensions than the classical method. The higher dimension of phase space improves the quality of the method, while retaining the computational complexity of a standard phase vocoder based on regular samples.
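The sample-count claim in the abstract above can be illustrated with a small finite-dimensional sketch. This is not the paper's algorithm: it assumes a toy "phase space" given by a harmonic tight frame of N vectors in an n-dimensional signal space, skips the processing step entirely, and only compares exact synthesis with a Monte Carlo estimate built from K randomly drawn frame vectors. All names (`harmonic_frame`, `monte_carlo_synthesis`, and so on) are hypothetical.

```python
import numpy as np


def harmonic_frame(n, N):
    """Return an n x N matrix whose columns are unit-norm frame vectors.

    Column k is a truncated complex exponential; for N >= n the columns
    form a tight frame of C^n with frame bound N / n.
    """
    rows = np.arange(n)[:, None]
    cols = np.arange(N)[None, :]
    return np.exp(2j * np.pi * rows * cols / N) / np.sqrt(n)


def exact_synthesis(f, Phi):
    """Analyze f with every frame vector, then synthesize (cost grows with N)."""
    n, N = Phi.shape
    coeffs = Phi.conj().T @ f          # inner products <f, phi_k>
    return (n / N) * (Phi @ coeffs)


def monte_carlo_synthesis(f, Phi, K, rng):
    """Unbiased estimate of the synthesis from K randomly drawn frame vectors."""
    n, N = Phi.shape
    idx = rng.integers(0, N, size=K)   # uniform draws, with replacement
    sampled = Phi[:, idx]
    coeffs = sampled.conj().T @ f
    return (n / K) * (sampled @ coeffs)


def relative_error(f, g):
    return np.linalg.norm(f - g) / np.linalg.norm(f)


rng = np.random.default_rng(0)
n, N = 16, 4096                        # toy sizes, chosen for illustration only
f = rng.standard_normal(n)
Phi = harmonic_frame(n, N)

for K in (160, 1600, 16000):
    err = relative_error(f, monte_carlo_synthesis(f, Phi, K, rng))
    print(f"K = {K:6d}  relative error = {err:.3f}")
```

For this toy frame the expected squared relative error works out to (n - 1) / K, so the number of samples needed for a given accuracy depends on the signal dimension n and not on the number of frame vectors N.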

arXiv.org e-Print Archive

Open Access LMU

Time-scale modification of audio and speech signals (Audio- ja puhesignaalien aika-asteikon muuttaminen)

Author: Damskägg Eero-Pekka
Publication date: 12/02/2018

In audio time-scale modification (TSM), the duration of an audio recording is changed while retaining its local frequency content. In this thesis, a novel phase-vocoder-based technique for TSM was developed, built on the new concept of fuzzy classification of points in the time-frequency representation of an input signal. The points are classified into three signal classes (tonalness, noisiness, and transientness) by assigning each point a continuous degree of membership in each class. The information from the classification is used to preserve the distinct nature of these components during modification. The quality of the proposed method was evaluated by means of a listening test. The proposed method scored slightly higher than a state-of-the-art academic TSM technique, and similarly to a commercial TSM software product. The proposed method is suitable for high-quality TSM of a wide variety of audio and speech signals.

Aaltodoc Publication Archive

A transient-preserving audio time-stretching algorithm and a real-time realization for a commercial music product

The core of this work is a sub-band transient detection and preservation scheme based on complex-domain transient detection and inspired by Röbel’s work. The proposed technique can be integrated into a real-time phase vocoder analysis/synthesis scheme without introducing latency and at relatively low computational cost.
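As background for the abstract above, here is a minimal sketch of a full-band complex-domain detection function in its commonly described form: each STFT bin is predicted from the previous magnitude and a linearly extrapolated phase, and the deviations are summed over bins. The sub-band scheme of the thesis is not reproduced; the frame length, hop size, test signal, and function names are assumptions made for illustration.

```python
import numpy as np


def stft_frames(x, frame_len=512, hop=128):
    """Hann-windowed STFT, shape (num_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    starts = range(0, len(x) - frame_len + 1, hop)
    return np.array([np.fft.rfft(window * x[s:s + frame_len]) for s in starts])


def complex_domain_detection(X):
    """Sum over bins of |X[m] - prediction| for each frame m.

    The prediction keeps the previous magnitude and extrapolates the phase
    linearly from the two previous frames. Entries 0 and 1 stay at zero
    because they have no history.
    """
    mag = np.abs(X)
    phase = np.angle(X)
    detection = np.zeros(len(X))
    for m in range(2, len(X)):
        predicted = mag[m - 1] * np.exp(1j * (2 * phase[m - 1] - phase[m - 2]))
        detection[m] = np.sum(np.abs(X[m] - predicted))
    return detection


fs = 8000                                # assumed sample rate
t = np.arange(8192) / fs
x = 0.5 * np.sin(2 * np.pi * 440 * t)    # steady tone: well predicted
x[4096] += 1.0                           # a click standing in for a transient

detection = complex_domain_detection(stft_frames(x))
peak = int(np.argmax(detection))
print("peak frame:", peak, "starting at sample", peak * 128)
```

On the steady tone the prediction is nearly exact, so the detection function stays low; it rises sharply in the frames that contain the click.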

Padua Thesis and Dissertation Archive

Computationally efficient music synthesis: methods and sound design

Author: Pekonen Jussi
Publication venue: Teknillinen korkeakoulu
Publication date: 01/01/2007

In this thesis, the design of a music synthesizer for systems with limited computing power and memory capacity is presented. First, possible synthesis techniques are reviewed and their applicability to computationally efficient music synthesis is discussed. In practice, the applicable techniques are limited to additive and source-filter synthesis and, in special cases, to frequency modulation, wavetable, and sampling synthesis. Next, the design of the structures of the applicable techniques is presented in detail, and the properties and design issues of these structures are discussed. A major implementation problem arises in digital source-filter synthesis, where using classic waveforms, such as the sawtooth wave, as the source signal is challenging due to aliasing caused by waveform discontinuities. Existing methods for bandlimited waveform synthesis are reviewed, and a new approach using a polynomial bandlimited step function is presented in detail, with design rules for the applicable polynomials. The approach is also tested with two different third-order polynomials. They reduce aliasing more at high frequencies, but at low frequencies they perform worse than the first-order polynomial. In addition, some commonly used sound effect algorithms are reviewed with respect to their applicability to computationally efficient music synthesis. In many cases the sound synthesis system must be capable of producing music consisting of many different sounds, ranging from real acoustic instruments to electronic instruments and sounds from nature. Therefore, the music synthesis system requires careful sound design. In this thesis, sound design rules for imitating various sounds with the computationally efficient synthesis techniques are presented. In addition, the effect of the synthesis parameters on the design of sound variants is presented.
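The aliasing problem and the polynomial step correction mentioned in the abstract can be sketched as follows. The code uses the widely circulated two-sample polynomial BLEP residual, which is an assumption here: the thesis's own polynomials and design rules are not given in the abstract and are not reproduced. The function names and the test pitch (187 cycles in 4800 samples, that is 1870 Hz at a 48 kHz sample rate) are likewise illustrative.

```python
import numpy as np


def poly_blep(phase, dt):
    """Two-sample polynomial residual for a downward step of height 2 at phase 0."""
    after = phase < dt               # first sample after the discontinuity
    before = phase > 1.0 - dt        # last sample before the discontinuity
    ta = phase / dt
    tb = (phase - 1.0) / dt
    residual = np.zeros_like(phase)
    residual = np.where(after, 2.0 * ta - ta * ta - 1.0, residual)
    residual = np.where(before, tb * tb + 2.0 * tb + 1.0, residual)
    return residual


def sawtooth(num_samples, cycles, bandlimited):
    """Sawtooth completing `cycles` whole periods in `num_samples` samples."""
    n = np.arange(num_samples)
    phase = ((cycles * n) % num_samples) / num_samples   # exact phase in [0, 1)
    dt = cycles / num_samples                            # phase step per sample
    naive = 2.0 * phase - 1.0
    if not bandlimited:
        return naive
    return naive - poly_blep(phase, dt)


def aliasing_energy(x, cycles):
    """Spectral energy that is not at DC or at a true harmonic bin."""
    power = np.abs(np.fft.rfft(x)) ** 2
    wanted = np.arange(0, len(power), cycles)
    return power.sum() - power[wanted].sum()


num_samples, cycles = 4800, 187
naive = sawtooth(num_samples, cycles, bandlimited=False)
smoothed = sawtooth(num_samples, cycles, bandlimited=True)
ratio = aliasing_energy(smoothed, cycles) / aliasing_energy(naive, cycles)
print(f"aliasing energy, corrected / naive: {ratio:.4f}")
```

Because the waveform is exactly periodic over the analysis length, every true harmonic falls on its own FFT bin, and whatever energy lies elsewhere can be attributed to aliasing.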

Aaltodoc Publication Archive

Independent formant and pitch control applied to singing voice

Author: Calitz Wietsche Roets
Publication venue: Stellenbosch : University of Stellenbosch
Publication date: 01/12/2004

Thesis (MScIng)--University of Stellenbosch, 2004. A singing voice can be manipulated artificially by means of a digital computer in order to create new melodies or to correct existing ones. When the fundamental frequency of an audio signal that represents a human voice is changed by simple algorithms, the formants of the voice tend to move to new frequency locations, making it sound unnatural. The main purpose is to design a technique by which the pitch and formants of a singing voice can be controlled independently.
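The problem described in the abstract, formants moving when pitch is changed by a simple algorithm, can be reproduced numerically. The sketch below illustrates only the problem, not the thesis's technique: it assumes a synthetic vowel-like tone with a single spectral bump at 1000 Hz and shifts its pitch by plain resampling. The signal model, parameter values, and function names are hypothetical.

```python
import numpy as np

fs = 16000  # assumed sample rate in Hz


def vowel_like(duration, f0=100.0, formant=1000.0, width=200.0):
    """Sum of harmonics of f0, weighted by a Gaussian bump around `formant`."""
    t = np.arange(int(duration * fs)) / fs
    harmonics = np.arange(f0, fs / 2, f0)
    weights = np.exp(-0.5 * ((harmonics - formant) / width) ** 2)
    return sum(w * np.cos(2 * np.pi * h * t) for w, h in zip(weights, harmonics))


def resample_pitch_shift(x, ratio):
    """Naive pitch shift: read the signal `ratio` times faster (linear interpolation)."""
    positions = np.arange(0, len(x) - 1, ratio)
    return np.interp(positions, np.arange(len(x)), x)


def strongest_frequency(x):
    """Frequency in Hz of the largest spectral peak in the first second of x."""
    segment = x[:fs]
    spectrum = np.abs(np.fft.rfft(segment))
    return np.argmax(spectrum) * fs / len(segment)


voice = vowel_like(1.5)
shifted = resample_pitch_shift(voice, 1.5)   # pitch raised from 100 Hz to 150 Hz

print("spectral peak before:", strongest_frequency(voice), "Hz")
print("spectral peak after: ", strongest_frequency(shifted), "Hz")
```

Resampling scales every frequency by the same ratio, so the spectral bump travels with the harmonics; independent control requires handling the spectral envelope separately from the pitch.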

Stellenbosch University SUNScholar Repository

Statistical models for natural sounds

Author: Turner R.E.
Publication venue: UCL (University College London)
Publication date: 01/01/2010

It is important to understand the rich structure of natural sounds in order to solve important tasks, like automatic speech recognition, and to understand auditory processing in the brain. This thesis takes a step in this direction by characterising the statistics of simple natural sounds. We focus on the statistics because perception often appears to depend on them, rather than on the raw waveform. For example, the perception of auditory textures, like running water, wind, fire and rain, depends on summary statistics, like the rate of falling rain droplets, rather than on the exact details of the physical source. In order to analyse the statistics of sounds accurately it is necessary to improve a number of traditional signal processing methods, including those for amplitude demodulation, time-frequency analysis, and sub-band demodulation. These estimation tasks are ill-posed and therefore it is natural to treat them as Bayesian inference problems. The new probabilistic versions of these methods have several advantages. For example, they perform more accurately on natural signals, are more robust to noise, can fill in missing sections of data, and provide error bars. Furthermore, free parameters can be learned from the signal. Using these new algorithms we demonstrate that the energy, sparsity, modulation depth and modulation time-scale in each sub-band of a signal are critical statistics, together with the dependencies between the sub-band modulators. In order to validate this claim, a model containing co-modulated coloured noise carriers is shown to be capable of generating a range of realistic-sounding auditory textures. Finally, we explore the connection between the statistics of natural sounds and perception. We demonstrate that inference in the model for auditory textures qualitatively replicates the primitive grouping rules that listeners use to understand simple acoustic scenes. This suggests that the auditory system is optimised for the statistics of natural sounds.

CiteSeerX

UCL Discovery