436 research outputs found

    SynthÚse de textures sonores à partir de statistiques temps-fréquence

    No full text
    Sound textures are a wide class of sounds that includes the sound of falling rain, the hubbub of a crowd, and the chirping of flocks of birds. All these sounds present an element of unpredictability which is not commonly sought after in sound synthesis, requiring the use of dedicated algorithms. However, the diverse audio properties of sound textures make designing an algorithm able to convincingly recreate varied textures a complex task. This thesis focuses on parametric sound texture synthesis. In this paradigm, a set of summary statistics is extracted from a target texture and iteratively imposed onto white noise. If the set of statistics is appropriate, the white noise is modified until it resembles the target, sounding as if it had been recorded moments later. In a first part, we propose improvements to a perception-based parametric method. These improvements aim at improving its synthesis of sharp and salient events, mainly by altering and simplifying its imposition process. In a second part, we adapt a parametric visual texture synthesis method based on statistics extracted by a Convolutional Neural Network (CNN) to work on sound textures. We modify the computation of its statistics to fit the properties of sound signals, alter the architecture of the CNN to best fit the audio elements present in sound textures, and use a time-frequency representation taking both magnitude and phase into account.
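    The "imposition" step described above can be illustrated with a deliberately minimal sketch: here a single summary statistic (the target's magnitude spectrum) is imposed on white noise in one shot, whereas the thesis imposes a much richer statistics set iteratively. All variable names are illustrative, not taken from the thesis.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N = 1024
    target = rng.standard_normal(N)   # stand-in for a recorded target texture
    noise = rng.standard_normal(N)    # white-noise initialisation

    # One summary statistic extracted from the target: its magnitude spectrum.
    target_mag = np.abs(np.fft.rfft(target))

    # "Imposition": give the noise the target's spectral magnitudes while
    # keeping the noise's own (random) phases.
    spec = np.fft.rfft(noise)
    imposed = target_mag * np.exp(1j * np.angle(spec))
    synth = np.fft.irfft(imposed, n=N)

    # The synthesised signal now shares the target's statistic...
    assert np.allclose(np.abs(np.fft.rfft(synth)), target_mag)
    # ...yet differs from the target sample-by-sample, as a parametric
    # synthesis should: same statistics, different realisation.
    assert not np.allclose(synth, target)
    ```

    With a richer statistics set and an iterative (gradient-based) imposition, the same idea yields signals that sound like the target without copying it.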

    Gradient conversion between time and frequency domains using Wirtinger calculus

    No full text
    Gradient descent algorithms are found in a variety of scientific fields, audio signal processing included. This paper presents a new method of converting any gradient of a cost function with respect to a signal into, or from, a gradient with respect to the spectrum of this signal: it thus allows gradient descent to be performed indiscriminately in the time or frequency domain. For efficiency purposes, and because the gradient of a real function with respect to a complex signal does not formally exist, this work is performed using Wirtinger calculus. An application to sound texture synthesis then experimentally validates this gradient conversion.
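    The conversion described above can be checked numerically on a toy cost. For a real signal x with DFT X, the Wirtinger gradient g = ∂L/∂X̄ converts to a time-domain gradient via ∂L/∂x = 2·Re(N·ifft(g)) under numpy's FFT convention. This is a minimal sketch with an illustrative cost L(X) = Σ|X_k|², not the paper's implementation.

    ```python
    import numpy as np

    N = 8
    rng = np.random.default_rng(0)
    x = rng.standard_normal(N)
    X = np.fft.fft(x)

    # Toy cost: L(X) = sum_k |X_k|^2, which by Parseval equals N * sum_n x_n^2,
    # so the analytic time-domain gradient is dL/dx = 2*N*x.
    # Wirtinger gradient with respect to conj(X): g_k = dL/dX_k* = X_k.
    g = X

    # Frequency -> time gradient conversion: dL/dx = 2 * Re(N * ifft(g)).
    grad_time = 2.0 * N * np.real(np.fft.ifft(g))

    assert np.allclose(grad_time, 2.0 * N * x)
    ```

    The factor 2·Re(·) comes from summing the X and conj(X) chain-rule terms, which are complex conjugates of each other when x is real.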

    [Coin: Bronze, Ariassos, Pisidia, Caracalla]

    No full text
    Belongs to the documentary set: MonnGr

    [Coin: Bronze, Pergamon, Mysia, Caracalla]

    No full text
    Belongs to the documentary set: MonnGr

    Sound texture synthesis using Convolutional Neural Networks

    No full text
    The following article introduces a new parametric synthesis algorithm for sound textures inspired by existing methods used for visual textures. Using a 2D Convolutional Neural Network (CNN), a sound signal is modified until the temporal cross-correlations of the feature maps of its log-spectrogram resemble those of a target texture. We show that the resulting synthesized sound signal is both different from the original and of high quality, while being able to reproduce singular events appearing in the original. This process is performed in the time domain, discarding the harmful phase recovery step which usually concludes synthesis performed in the time-frequency domain. It is also straightforward and flexible, as it does not require any fine-tuning between several losses when synthesizing diverse sound textures. Synthesized spectrograms and sound signals are showcased, and a way of extending the synthesis in order to produce a sound of any length is also presented. We also discuss the choice of CNN, border effects in our synthesized signals, and possible ways of modifying the algorithm in order to improve its current long computation time.
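    The statistic driving this synthesis, temporal cross-correlations between CNN feature maps, can be sketched with plain numpy. The function name, array shapes, and the random stand-ins for feature maps are all illustrative assumptions, not the article's code; in the actual method the loss below would be minimised by gradient descent on the sound signal itself.

    ```python
    import numpy as np

    def temporal_cross_correlation(features):
        """Cross-correlations between feature maps, averaged over time.

        features: array of shape (n_maps, n_frames), one CNN feature map of
        the log-spectrogram per row (illustrative layout).
        Returns an (n_maps, n_maps) Gram-like matrix.
        """
        n_maps, n_frames = features.shape
        return features @ features.T / n_frames

    rng = np.random.default_rng(0)
    target = rng.standard_normal((16, 128))  # stand-in for target feature maps
    synth = rng.standard_normal((16, 128))   # stand-in for current synthesis

    # Texture loss: squared distance between the two correlation matrices.
    loss = np.sum((temporal_cross_correlation(synth)
                   - temporal_cross_correlation(target)) ** 2)
    assert loss > 0.0
    ```

    Because the statistic averages over time, two signals can match it exactly while differing frame by frame, which is what allows the synthesis to be "different from the original" yet texturally alike.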

    [Coin: Bronze, Antioch, Pisidia, Caracalla]

    No full text
    Belongs to the documentary set: MonnGr