
    EMG-to-Speech: Direct Generation of Speech from Facial Electromyographic Signals

    The general objective of this work is the design, implementation, improvement, and evaluation of a system that uses surface electromyographic (EMG) signals and directly synthesizes an audible speech output: EMG-to-speech.
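    As a rough, conceptual sketch of what direct synthesis means here, one could regress from per-frame EMG features to acoustic frames and reconstruct a waveform with a generic vocoder. The MLP regressor, log-mel targets, and Griffin-Lim inversion below are illustrative assumptions, not the system described in this work.

    ```python
    # Conceptual sketch only: frame-wise EMG-feature -> log-mel regression,
    # inverted to audio with Griffin-Lim. All specific choices here are assumptions.
    import librosa
    from sklearn.neural_network import MLPRegressor

    def train_emg_to_speech(emg_feats, log_mel_frames):
        """emg_feats: (n_frames, d_emg); log_mel_frames: (n_frames, n_mels) targets in dB."""
        model = MLPRegressor(hidden_layer_sizes=(256, 256), max_iter=200)
        model.fit(emg_feats, log_mel_frames)
        return model

    def synthesize(model, emg_feats, sr=16000, n_fft=1024, hop=160):
        log_mel = model.predict(emg_feats)            # (n_frames, n_mels), dB scale
        mel_power = librosa.db_to_power(log_mel.T)    # (n_mels, n_frames) power mel
        # Griffin-Lim based inversion of the mel spectrogram to a waveform
        return librosa.feature.inverse.mel_to_audio(
            mel_power, sr=sr, n_fft=n_fft, hop_length=hop)
    ```

    Fitting such a mapping would require a parallel corpus of time-aligned EMG and audio frames from the same speaker.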

    Advancing Electromyographic Continuous Speech Recognition: Signal Preprocessing and Modeling

    Speech is the natural medium of human communication, but audible speech can be overheard by bystanders and excludes speech-disabled people. This work presents a speech recognizer based on surface electromyography, in which the electric potentials of the facial muscles are captured by surface electrodes, allowing speech to be processed nonacoustically. A system that was state-of-the-art at the beginning of this book is substantially improved in terms of accuracy, flexibility, and robustness.
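    To make the signal-processing side concrete, the following is a minimal sketch of frame-based feature extraction from a single surface EMG channel; the 600 Hz sampling rate, 50 ms frames with a 10 ms shift, and the mean/power/zero-crossing features are generic assumptions, not the preprocessing chain developed in the book.

    ```python
    # Sketch: time-domain features per frame from one surface EMG channel.
    # Assumed (not from the book): fs = 600 Hz, 50 ms frames, 10 ms frame shift.
    import numpy as np

    def emg_frame_features(x, fs=600, frame_ms=50, shift_ms=10):
        frame = int(fs * frame_ms / 1000)
        shift = int(fs * shift_ms / 1000)
        feats = []
        for start in range(0, len(x) - frame + 1, shift):
            w = x[start:start + frame]
            mean = float(w.mean())                                        # low-frequency level
            power = float(np.mean(w ** 2))                                # frame energy
            zc = float(np.mean(np.abs(np.diff(np.sign(w - mean))) > 0))   # zero-crossing rate
            feats.append([mean, power, zc])
        return np.asarray(feats)                                          # shape: (n_frames, 3)
    ```

    Stacking such frames across all electrode channels, together with some temporal context, yields the feature vectors on which a recognizer (e.g. a GMM-HMM or neural-network acoustic model) could be trained.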

    Aural Contract: Investigations at the Threshold of Audibility

    There are many studies dedicated to speech politics, yet the politics of listening remains an underdeveloped area of research. The conditions by which judges, lawyers, police, legislators, and witnesses listen—especially given the increasing employment of forensic audio technologies—deserve closer inspection. This practice-based PhD thesis investigates the political and legal implications of radically new modes of listening, recording, and audio analysis that have emerged since the mid-1980s. It borrows strategies from forensic audio analysis and art to map out the contemporary thresholds of audibility—both human and machinic—as new cultural and political frontiers where issues of subjecthood, citizenship, and testimony are being defined. This thesis is situated at the intersection of art, science, and advocacy, and as such each of the three chapters, together with the methodological introduction, develops its argumentation through a variety of means. The written component develops a historical and theoretical analysis of the ways in which we listen, while in the practice portfolio I test these propositions through both audiovisual artworks and investigative sonic experiments. The textual and practical dimensions are thus mutually constitutive: the historical and theoretical enquiry feeds into the practice, while the practice interrogates and attempts to materially implement these critical assumptions as political audio investigations for human and civil rights. In analysing the thresholds of sound and voice, we recurrently encounter forms of border-crossing, be they material, juridical, sensorial, or conceptual. In Chapter 1 we see the ways in which the voice transgresses the borders between states, both national and ontological. Chapter 2 discusses the blur between foreground and background, sound and noise. In Chapter 3 the way sounds bleed through the walls of a building leads us to the seepage between sound, sight, and touch. The title Aural Contract refers to a shift from the oral to the aural, and from a contract between speaking subjects towards a new set of propositions for the conditions by which we listen to one another and can produce audible evidence. With this shift of analysis from speaking to listening, new modes of political subjectivity emerge: a new spectrum of sounds and silences by which we can make audible those at the threshold of politics—the political prisoner, the colonised, the ghettoised, and the migrant.

    Robust visual speech recognition using optical flow analysis and rotation invariant features

    The focus of this thesis is to develop computer vision algorithms for a visual speech recognition system that identifies visemes. Most existing speech recognition systems are based on audio-visual signals, have been developed for speech enhancement, and remain prone to acoustic noise. Considering this problem, the aim of this research is to investigate and develop a visual-only speech recognition system suitable for noisy environments. Potential applications of such a system include lip-reading mobile phones, human-computer interfaces (HCI) for mobility-impaired users, robotics, surveillance, improved speech-based computer control in noisy environments, and rehabilitation of persons who have undergone laryngectomy surgery. In the literature, several models and algorithms are available for visual feature extraction. These features are extracted from static mouth images and characterized as appearance- and shape-based features; however, such methods rarely incorporate the time-dependent information of mouth dynamics. This dissertation presents two optical flow based approaches to visual feature extraction, which capture the mouth motions in an image sequence. The motivation for using motion features is that human lip-reading perception depends on the temporal dynamics of mouth motion. The first approach extracts features from the vertical component of the optical flow: the vertical component is decomposed into multiple non-overlapping, fixed-scale blocks, and statistical features of each block are computed for successive video frames of an utterance. To overcome the large variation in speaking rate, each utterance is normalized using a simple linear interpolation method. In the second approach, four directional motion templates based on optical flow are developed, each representing the consolidated motion information of an utterance in one of four directions (up, down, left, and right). This approach is an evolution of a view-based approach known as the motion history image (MHI). One of the main issues with the MHI method is its motion-overwriting problem caused by self-occlusion; the directional motion history images (DMHIs) appear to resolve this overwriting issue. Two types of image descriptors, Zernike moments and Hu moments, are used to represent each DMHI. A support vector machine (SVM) classifier was used to classify the features obtained from the optical flow vertical component and from the Zernike and Hu moments separately, and a multiclass SVM approach was employed for viseme identification. A video speech corpus of seven subjects was used to evaluate the efficiency of the proposed methods for lip-reading. The experimental results demonstrate the promising performance of the optical flow based mouth-movement representations, and a comparison between DMHI and MHI based on Zernike moments shows that the DMHI technique outperforms the MHI technique. A video-based ad hoc temporal segmentation method for isolated utterances is also proposed in the thesis: it detects the start and end frames of an utterance from an image sequence using a pair-wise pixel comparison method, and its efficiency was tested on the available data set with short pauses between each utterance.
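    As an illustration of the first approach, the sketch below computes the vertical component of dense optical flow between consecutive mouth-region frames, splits it into non-overlapping blocks, collects per-block statistics, and length-normalizes the utterance by linear interpolation. The Farneback flow, the 4x4 block grid, and the mean/variance statistics are assumptions for illustration rather than the thesis' exact configuration.

    ```python
    # Sketch: block statistics of the optical-flow vertical component, plus
    # linear-interpolation length normalization. Grid size and statistics are assumed.
    import numpy as np
    import cv2

    def vertical_flow_block_features(frames, grid=(4, 4)):
        """frames: list of grayscale mouth-region images; returns (n_frames - 1, n_feats)."""
        feats = []
        for prev, curr in zip(frames[:-1], frames[1:]):
            flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            v = flow[..., 1]                         # keep only vertical motion
            rows = np.array_split(v, grid[0], axis=0)
            blocks = [b for r in rows for b in np.array_split(r, grid[1], axis=1)]
            feats.append([s for b in blocks for s in (b.mean(), b.var())])
        return np.asarray(feats)

    def normalize_length(feats, target_len=20):
        """Resample the feature sequence to a fixed length to compensate for speaking rate."""
        t_old = np.linspace(0.0, 1.0, num=len(feats))
        t_new = np.linspace(0.0, 1.0, num=target_len)
        return np.stack([np.interp(t_new, t_old, feats[:, d])
                         for d in range(feats.shape[1])], axis=1)
    ```

    The flattened, length-normalized matrices could then be fed to a multiclass SVM (e.g. sklearn.svm.SVC) to label visemes, mirroring the classification stage described above.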

    A sound takes place : noise, difference and sonorous individuation after Deleuze

    This thesis traces an idea of auditory influence or sonorous individuation through three distinct areas of sound-art practice. These three areas are discussed according to a kind of spatial contraction, passing from the idea of auditory influence in acoustic ecology and field recording practices, to its expression in work happening at the intersection of sound-art and architecture, and finally towards headphonic space and the interior of the body. Through these diverse fields and divergent practices a common idea pertaining to the influence of the auditory upon listening subjects is revealed, which itself raises questions concerning the constitution of a specifically auditory subjectivity in relation to the subject ‘as a whole’. Towards the expression of a theory of sonorous individuation appropriate to practices approaching sonorous matters in the mode of a sonic materialism, the philosophical work of Gilles Deleuze is called upon as a critical framework. This philosophical framework is adopted as it clearly expresses a spatio-temporally contingent theory of individuation. This particular contingency becomes necessary in exploring works wherein the production of acoustic space is understood as being indissociable from a subjective ‘modulation’ or process of sonorous individuation, in which auditory individuals or listening subjects are bound within and influenced by acoustic spaces in which a sound takes place and a self takes shape.