6 research outputs found

    Speech, time-frequency representations

    This paper reviews the use of time-frequency representations in speech analysis and automatic speech processing. Three main groups of methods are considered: speech-production-based methods, general signal analysis methods, and auditory-based methods. After this review, some brief conclusions on their current use and on possible future developments are proposed.

    Approche informée pour l'analyse du son et de la musique (An informed approach to sound and music analysis)

    In the field of audio signal processing, analysis is an essential step that makes it possible to understand and interact with existing signals. The quality of transformed or synthesized audio signals depends on the accuracy of the estimated model parameters. However, theoretical limits exist and show that the best accuracy a classic estimator can reach may be insufficient for the most demanding applications (e.g. active listening of music). The work developed in this thesis revisits well-known audio analysis problems, such as spectral analysis, automatic transcription of music, and audio source separation, using a novel "informed" approach. This approach takes advantage of a specific configuration in which the parameters of the elementary signals that compose a mixture are known before the mixing process. Using the tools proposed in this thesis, minimal side information is computed and transmitted with the mixture signal. This allows certain transformations of the mixture signal while guaranteeing the resulting quality. When compatibility with existing audio formats is required, the side information is embedded inaudibly in the mixture signal itself using a watermarking technique. This work covers several theoretical and practical aspects of audio signal processing. We show that a classic estimator combined with sufficient side information can outperform the usual approaches (uninformed estimation or pure coding). BORDEAUX1-Bib.electronique (335229901) / Sudoc, France

    Représentation du signal de parole par une somme de fonctions élémentaires (Representation of the speech signal as a sum of elementary functions)

    A new method for speech signal representation, based on elementary waveforms, is proposed, and its application in the field of automatic speech processing is discussed. The decomposition of the speech signal into a set of well-localized discrete time-frequency elements is studied from three viewpoints. First, formal considerations and algorithms are derived by interpreting classical non-parametric methods (the short-time Fourier transform and the wavelet transform) as time-domain elementary waveform representations. Second, the relationship is explored between granular analysis, a particular mode of elementary waveform representation, and the analysis of the acoustic signal produced by models of the peripheral auditory system. Finally, a model-based elementary waveform representation of speech is presented and used in an automatic analysis-synthesis system. Some applications of this new representation to speech analysis are proposed. Speech synthesis was the primary aim of this work, and an original synthesis structure, including an automatic system for parameter estimation, is described and discussed.

    Bases cérébrales de la perception auditive simple et complexe dans l'autisme (Brain bases of simple and complex auditory perception in autism)

    Perception involves the processes that allow the brain to extract and understand sensory information. Atypical perceptual processing has been associated with the autistic phenotype, which is usually described in terms of impairments in social and communication abilities, as well as restricted interests and repetitive behaviours. Perceptual atypicalities are reported across a range of tasks. For instance, autistics outperform non-autistics in pure tone discrimination as well as in tasks that require detecting shapes embedded in a complex figure. One particular study reported atypical low-level visual processing in autism. In this experiment, autistics displayed enhanced performance in identifying the orientation of luminance-defined gratings and inferior performance for texture-defined gratings in comparison to non-autistics. This dichotomous pattern led to the formulation of a hypothesis suggesting an inverse relation between the level of performance and the extent (or complexity) of the cortical network required for processing the stimuli. Specifically, autistics would perform better than non-autistics when processing visual stimuli involving one cortical region (luminance-defined or simple stimuli), while they would show decreased performance when processing stimuli involving a network of cortical regions (texture-defined or complex stimuli). Atypical perceptual processing is described as a general feature associated with the autistic phenotype and is reported for both the visual and the auditory modalities. Considering the existing parallels between the two sensory modalities, the principal purpose of this doctoral dissertation is to verify whether the hypothesis proposed to explain atypical visual processing in autism could also apply to audition. The first article (Chapter 2) is an exhaustive literature review of studies on autistics' auditory processing abilities. Taken together, the results suggest that the level of performance of autistics on auditory tasks could be related to the acoustic complexity of the stimuli. The second article (Chapter 3) uses quantitative meta-analysis to investigate how auditory complexity is represented at the cortical level in non-autistics. This study confirms the hierarchical functional organization of the auditory cortex and makes it possible to define simple and complex auditory stimuli based on the extent of the cortical network involved in their processing, as was done in vision. The third article (Chapter 4) verifies whether the predictions of the hypothesis proposed in vision could also apply in audition. Specifically, this study examines the cortical auditory response to simple and complex sounds in autistics and non-autistics. As expected, autistics display atypical cortical activity in response to complex auditory material, that is, stimuli whose processing involves a network of multiple cortical regions. In sum, the studies in this dissertation indicate that the predictions of the hypothesis proposed in vision could extend to audition and possibly explain some of the atypical behaviours related to auditory processing in autism. This thesis demonstrates fundamentally different auditory cortical processing in autistics, which could help define a general model of perceptual differences in autism and could represent a key factor in understanding information acquisition in this population.

    Inner Song: Phenomenological Description of a Musical Object of Phantasy

    This dissertation is the phenomenological description of a musical object of phantasy I call "inner song," i.e., the music that the musician "sings in his or her head" while practicing his or her instrument. It describes the specific inner song of a single musician playing a melodic instrument and rehearsing in a solipsistic situation. The description is based on three resources: my personal experience as a cellist; the third-person experiences of other musicians I have interviewed on that topic since 2010; and the Husserlian corpus. Each chapter starts with excerpts of interviews focusing on specific aspects of the inner song. Within each chapter, each section starts with a short, italicized description of my experience as a cellist. An introduction defines the terms and explains the methodology. The description unfolds in five chapters: first, it describes the double epoché through which the musician switches from the natural attitude into the musician's attitude, a form of phenomenological attitude; it then describes the presentification of the inner song in phantasy, and the presentation of the actual song in perception; thirdly, it explains the various layers of the embodied ego playing the musical instrument, perceiving the performance, and phantasizing the inner song; finally, the last two chapters describe the process of spatialization and temporalization of the inner song, as well as its unfolding in consciousness through the perception of its realization in performance. The conclusion opens research possibilities focusing on the inner song itself, or on using the inner song to explore consciousness.

    Library buildings around the world

    "Library Buildings around the World" is a survey based on several years of research. The objective was to collect library buildings at an international level, starting from 1990.