
    EMG-to-Speech: Direct Generation of Speech from Facial Electromyographic Signals

    The general objective of this work is the design, implementation, improvement, and evaluation of a system that uses surface electromyographic (EMG) signals and directly synthesizes an audible speech output: EMG-to-speech.
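    As a rough, conceptual sketch of what direct synthesis means here, one could regress from per-frame EMG features to acoustic frames and reconstruct a waveform with a generic vocoder. The MLP regressor, log-mel targets, and Griffin-Lim inversion below are illustrative assumptions, not the system described in this work.

    ```python
    # Conceptual sketch only: frame-wise EMG-feature -> log-mel regression,
    # inverted to audio with Griffin-Lim. All specific choices here are assumptions.
    import librosa
    from sklearn.neural_network import MLPRegressor

    def train_emg_to_speech(emg_feats, log_mel_frames):
        """emg_feats: (n_frames, d_emg); log_mel_frames: (n_frames, n_mels) targets in dB."""
        model = MLPRegressor(hidden_layer_sizes=(256, 256), max_iter=200)
        model.fit(emg_feats, log_mel_frames)
        return model

    def synthesize(model, emg_feats, sr=16000, n_fft=1024, hop=160):
        log_mel = model.predict(emg_feats)            # (n_frames, n_mels), dB scale
        mel_power = librosa.db_to_power(log_mel.T)    # (n_mels, n_frames) power mel
        # Griffin-Lim based inversion of the mel spectrogram to a waveform
        return librosa.feature.inverse.mel_to_audio(
            mel_power, sr=sr, n_fft=n_fft, hop_length=hop)
    ```

    Fitting such a mapping would require a parallel corpus of time-aligned EMG and audio frames from the same speaker.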

    Advancing Electromyographic Continuous Speech Recognition: Signal Preprocessing and Modeling

    Speech is the natural medium of human communication, but audible speech can be overheard by bystanders and excludes speech-disabled people. This work presents a speech recognizer based on surface electromyography, in which the electric potentials of the facial muscles are captured by surface electrodes, allowing speech to be processed nonacoustically. A system that was state-of-the-art at the beginning of this book is substantially improved in terms of accuracy, flexibility, and robustness.
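    To make the signal-processing side concrete, the following is a minimal sketch of frame-based feature extraction from a single surface EMG channel; the 600 Hz sampling rate, 50 ms frames with a 10 ms shift, and the mean/power/zero-crossing features are generic assumptions, not the preprocessing chain developed in the book.

    ```python
    # Sketch: time-domain features per frame from one surface EMG channel.
    # Assumed (not from the book): fs = 600 Hz, 50 ms frames, 10 ms frame shift.
    import numpy as np

    def emg_frame_features(x, fs=600, frame_ms=50, shift_ms=10):
        frame = int(fs * frame_ms / 1000)
        shift = int(fs * shift_ms / 1000)
        feats = []
        for start in range(0, len(x) - frame + 1, shift):
            w = x[start:start + frame]
            mean = float(w.mean())                                        # low-frequency level
            power = float(np.mean(w ** 2))                                # frame energy
            zc = float(np.mean(np.abs(np.diff(np.sign(w - mean))) > 0))   # zero-crossing rate
            feats.append([mean, power, zc])
        return np.asarray(feats)                                          # shape: (n_frames, 3)
    ```

    Stacking such frames across all electrode channels, together with some temporal context, yields the feature vectors on which a recognizer (e.g. a GMM-HMM or neural-network acoustic model) could be trained.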

    Aural Contract: Investigations at the Threshold of Audibility

    There are many studies dedicated to speech politics, yet the politics of listening remains an underdeveloped area of research. The conditions by which judges, lawyers, police, legislators, and witnesses listen—especially given the increasing employment of forensic audio technologies—deserve closer inspection. This practice-based PhD thesis investigates the political and legal implications of radically new modes of listening, recording, and audio analysis that have emerged since the mid-1980s. It borrows strategies from forensic audio analysis and art to map out the contemporary thresholds of audibility—both human and machinic—as new cultural and political frontiers where issues of subjecthood, citizenship, and testimony are being defined. This thesis is situated at the intersection of art, science, and advocacy, and as such each of the three chapters, together with the methodological introduction, develops its argumentation through a variety of means. The written component develops a historical and theoretical analysis of the ways in which we listen, while in the practice portfolio I test these propositions through both audiovisual artworks and investigative sonic experiments. The textual and practical dimensions are thus mutually constitutive: the historical and theoretical enquiry feeds into the practice, while the practice interrogates and attempts to materially implement these critical assumptions as political audio investigations for human and civil rights. In analysing the thresholds of sound and voice, we recurrently encounter forms of border-crossing, be they material, juridical, sensorial, or conceptual. In Chapter 1 we see the ways in which the voice transgresses the borders between states, both national and ontological. Chapter 2 discusses the blur between foreground and background, sound and noise. In Chapter 3 the way sounds bleed through the walls of a building leads us to the seepage between sound, sight, and touch. The title Aural Contract refers to a shift from the oral to the aural, and from a contract between speaking subjects towards a new set of propositions for the conditions by which we listen to one another and can produce audible evidence. With this shift of analysis from speaking to listening, new modes of political subjectivity emerge: a new spectrum of sounds and silences by which we can make audible those at the threshold of politics—the political prisoner, the colonised, the ghettoised, and the migrant.

    Robust visual speech recognition using optical flow analysis and rotation invariant features

    The focus of this thesis is to develop computer vision algorithms for a visual speech recognition system that identifies visemes. Most existing speech recognition systems are based on audio-visual signals, have been developed for speech enhancement, and remain prone to acoustic noise. Considering this problem, the aim of this research is to investigate and develop a visual-only speech recognition system suitable for noisy environments. Potential applications of such a system include lip-reading mobile phones, human-computer interfaces (HCI) for mobility-impaired users, robotics, surveillance, improved speech-based computer control in noisy environments, and rehabilitation of persons who have undergone laryngectomy surgery. In the literature, several models and algorithms are available for visual feature extraction. These features are extracted from static mouth images and characterized as appearance- and shape-based features; however, such methods rarely incorporate the time-dependent information of mouth dynamics. This dissertation presents two optical flow based approaches to visual feature extraction, which capture the mouth motions in an image sequence. The motivation for using motion features is that human lip-reading perception depends on the temporal dynamics of mouth motion. The first approach extracts features from the vertical component of the optical flow: the vertical component is decomposed into multiple non-overlapping, fixed-scale blocks, and statistical features of each block are computed for successive video frames of an utterance. To overcome the large variation in speaking rate, each utterance is normalized using a simple linear interpolation method. In the second approach, four directional motion templates based on optical flow are developed, each representing the consolidated motion information of an utterance in one of four directions (up, down, left, and right). This approach is an evolution of a view-based approach known as the motion history image (MHI). One of the main issues with the MHI method is its motion-overwriting problem caused by self-occlusion; the directional motion history images (DMHIs) appear to resolve this overwriting issue. Two types of image descriptors, Zernike moments and Hu moments, are used to represent each DMHI. A support vector machine (SVM) classifier was used to classify the features obtained from the optical flow vertical component and from the Zernike and Hu moments separately, and a multiclass SVM approach was employed for viseme identification. A video speech corpus of seven subjects was used to evaluate the efficiency of the proposed methods for lip-reading. The experimental results demonstrate the promising performance of the optical flow based mouth-movement representations, and a comparison between DMHI and MHI based on Zernike moments shows that the DMHI technique outperforms the MHI technique. A video-based ad hoc temporal segmentation method for isolated utterances is also proposed in the thesis: it detects the start and end frames of an utterance from an image sequence using a pair-wise pixel comparison method, and its efficiency was tested on the available data set with short pauses between each utterance.
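    As an illustration of the first approach, the sketch below computes the vertical component of dense optical flow between consecutive mouth-region frames, splits it into non-overlapping blocks, collects per-block statistics, and length-normalizes the utterance by linear interpolation. The Farneback flow, the 4x4 block grid, and the mean/variance statistics are assumptions for illustration rather than the thesis' exact configuration.

    ```python
    # Sketch: block statistics of the optical-flow vertical component, plus
    # linear-interpolation length normalization. Grid size and statistics are assumed.
    import numpy as np
    import cv2

    def vertical_flow_block_features(frames, grid=(4, 4)):
        """frames: list of grayscale mouth-region images; returns (n_frames - 1, n_feats)."""
        feats = []
        for prev, curr in zip(frames[:-1], frames[1:]):
            flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            v = flow[..., 1]                         # keep only vertical motion
            rows = np.array_split(v, grid[0], axis=0)
            blocks = [b for r in rows for b in np.array_split(r, grid[1], axis=1)]
            feats.append([s for b in blocks for s in (b.mean(), b.var())])
        return np.asarray(feats)

    def normalize_length(feats, target_len=20):
        """Resample the feature sequence to a fixed length to compensate for speaking rate."""
        t_old = np.linspace(0.0, 1.0, num=len(feats))
        t_new = np.linspace(0.0, 1.0, num=target_len)
        return np.stack([np.interp(t_new, t_old, feats[:, d])
                         for d in range(feats.shape[1])], axis=1)
    ```

    The flattened, length-normalized matrices could then be fed to a multiclass SVM (e.g. sklearn.svm.SVC) to label visemes, mirroring the classification stage described above.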

    A sound takes place : noise, difference and sonorous individuation after Deleuze

    This thesis traces an idea of auditory influence or sonorous individuation through three distinct areas of sound-art practice. These three areas are discussed according to a kind of spatial contraction, passing from the idea of auditory influence in acoustic ecology and field recording practices, to its expression in work happening at the intersection of sound-art and architecture, and finally towards headphonic space and the interior of the body. Through these diverse fields and divergent practices a common idea pertaining to the influence of the auditory upon listening subjects is revealed, which itself raises questions concerning the constitution of a specifically auditory subjectivity in relation to the subject ‘as a whole’. Towards the expression of a theory of sonorous individuation appropriate to practices approaching sonorous matters in the mode of a sonic materialism, the philosophical work of Gilles Deleuze is called upon as a critical framework. This philosophical framework is adopted as it clearly expresses a spatio-temporally contingent theory of individuation. This particular contingency becomes necessary in exploring works wherein the production of acoustic space is understood as being indissociable from a subjective ‘modulation’ or process of sonorous individuation, in which auditory individuals or listening subjects are bound within and influenced by acoustic spaces in which a sound takes place and a self takes shape.