84 research outputs found

    Speech analysis for Ambient Assisted Living : technical and user design of a vocal order system

    International audience. The evolution of ICT has led to the emergence of the smart home. A smart home is a home equipped with data-processing technology that anticipates the needs of its inhabitants while trying to maintain their comfort and safety by acting on the house and by implementing connections with the outside world. Smart homes equipped with ambient intelligence technology therefore constitute a promising direction for enabling the growing number of elderly people to continue living in their own homes as long as possible. However, the technological solutions requested by this part of the population have to suit their specific needs and capabilities. Smart homes tend to be equipped with devices whose interfaces are increasingly complex and difficult for the user to control. The people most likely to benefit from these new technologies are those experiencing a loss of autonomy, such as disabled people or elderly people with cognitive deficiencies (e.g., Alzheimer's disease). Moreover, these people are the least capable of using complex interfaces, because of their handicap or their lack of ICT understanding. It thus becomes essential to facilitate daily life and access to the whole home automation system through the smart home. The usual tactile interfaces should be supplemented by accessible interfaces, in particular a system that reacts to the voice; such interfaces are also useful when the person cannot move easily. Vocal orders will allow the following functionality: - To provide assistance through a traditional or vocal order. - To set up indirect order regulation for better energy management. - To reinforce the link with relatives through the integration of interfaces dedicated and adapted to the person in loss of autonomy. - To ensure more safety by detecting distress situations and break-ins. 
This chapter describes the steps needed to design an ambient audio system. The first step concerns acceptability and objections on the part of end users; we report a user evaluation assessing the acceptance and the fear of this new technology. The experiment aimed at testing three important aspects of speech interaction: voice command, communication with the outside world, and the home automation system interrupting a person's activity. It was conducted in a smart home with a voice command, using a Wizard of Oz technique, and gave information of great interest. The second step is a general presentation of audio sensing technology for ambient assisted living. Different aspects of sound and speech processing are developed, and the applications and challenges are presented. The third step concerns speech recognition in the home environment. Automatic speech recognition (ASR) systems have reached good performance with close-talking microphones (e.g., a headset), but performance decreases significantly as soon as the microphone is moved away from the speaker's mouth (e.g., when the microphone is set in the ceiling). This deterioration is due to a broad variety of effects, including reverberation and the presence of undetermined background noise such as TV, radio, and other devices. This part presents a system for vocal order recognition in a distant-speech context, which was evaluated experimentally in a dedicated flat. The chapter concludes with a discussion of the value of the speech modality for Ambient Assisted Living.

    Diphthong Synthesis using the Three-Dimensional Dynamic Digital Waveguide Mesh

    The human voice is a complex and nuanced instrument, and despite many years of research, no system is yet capable of producing natural-sounding synthetic speech. This affects intelligibility for some groups of listeners, in applications such as automated announcements and screen readers. Furthermore, those who require a computer to speak - due to surgery or a degenerative disease - are limited to unnatural-sounding voices that lack expressive control and may not match the user's gender, age or accent. It is evident that natural, personalised and controllable synthetic speech systems are required. A three-dimensional digital waveguide model of the vocal tract, based on magnetic resonance imaging data, is proposed here in order to address these issues. The model uses a heterogeneous digital waveguide mesh method to represent the vocal tract airway and surrounding tissues, facilitating dynamic movement and hence speech output. The accuracy of the method is validated by comparison with audio recordings of natural speech, and perceptual tests are performed which confirm that the proposed model sounds significantly more natural than simpler digital waveguide mesh vocal tract models. Control of such a model is also considered, and a proof-of-concept study is presented using a deep neural network to control the parameters of a two-dimensional vocal tract model, resulting in intelligible speech output and paving the way for extension of the control system to the proposed three-dimensional vocal tract model. Future improvements to the system are also discussed in detail. This project considers both the naturalness and control issues associated with synthetic speech and therefore represents a significant step towards improved synthetic speech for use across society

    Human robot interaction in a crowded environment

    Human Robot Interaction (HRI) is the primary means of establishing natural and affective communication between humans and robots. HRI enables robots to act in a way similar to humans in order to assist in activities that are considered to be laborious, unsafe, or repetitive. Vision based human robot interaction is a major component of HRI, with which visual information is used to interpret how human interaction takes place. Common tasks of HRI include finding pre-trained static or dynamic gestures in an image, which involves localising different key parts of the human body such as the face and hands. This information is subsequently used to extract different gestures. After the initial detection process, the robot is required to comprehend the underlying meaning of these gestures [3]. Thus far, most gesture recognition systems can only detect gestures and identify a person in relatively static environments. This is not realistic for practical applications as difficulties may arise from people's movements and changing illumination conditions. Another issue to consider is that of identifying the commanding person in a crowded scene, which is important for interpreting the navigation commands. To this end, it is necessary to associate the gesture to the correct person and automatic reasoning is required to extract the most probable location of the person who has initiated the gesture. In this thesis, we have proposed a practical framework for addressing the above issues. It attempts to achieve a coarse level understanding about a given environment before engaging in active communication. This includes recognizing human robot interaction, where a person has the intention to communicate with the robot. In this regard, it is necessary to differentiate if people present are engaged with each other or their surrounding environment. The basic task is to detect and reason about the environmental context and different interactions so as to respond accordingly. 
For example, if individuals are engaged in conversation, the robot should realize it is best not to disturb or, if an individual is receptive to the robot's interaction, it may approach the person. Finally, if the user is moving in the environment, it can analyse further to understand if any help can be offered in assisting this user. The method proposed in this thesis combines multiple visual cues in a Bayesian framework to identify people in a scene and determine potential intentions. For improving system performance, contextual feedback is used, which allows the Bayesian network to evolve and adjust itself according to the surrounding environment. The results achieved demonstrate the effectiveness of the technique in dealing with human-robot interaction in a relatively crowded environment [7]

    Phonetically transparent technique for the automatic transcription of speech


    Advancing Electromyographic Continuous Speech Recognition: Signal Preprocessing and Modeling

    Speech is the natural medium of human communication, but audible speech can be overheard by bystanders and excludes speech-disabled people. This work presents a speech recognizer based on surface electromyography, where electric potentials of the facial muscles are captured by surface electrodes, allowing speech to be processed nonacoustically. A system which was state-of-the-art at the beginning of this book is substantially improved in terms of accuracy, flexibility, and robustness

    Policy-Gradient Algorithms for Partially Observable Markov Decision Processes

    Partially observable Markov decision processes are interesting because of their ability to model most conceivable real-world learning problems, for example, robot navigation, driving a car, speech recognition, stock trading, and playing games. The downside of this generality is that exact algorithms are computationally intractable. Such computational complexity motivates approximate approaches. One such class of algorithms is the so-called policy-gradient methods from reinforcement learning. They seek to adjust the parameters of an agent in the direction that maximises the long-term average of a reward signal. Policy-gradient methods are attractive as a scalable approach for controlling partially observable Markov decision processes (POMDPs). In the most general case POMDP policies require some form of internal state, or memory, in order to act optimally. Policy-gradient methods have shown promise for problems admitting memory-less policies but have been less successful when memory is required. This thesis develops several improved algorithms for learning policies with memory in an infinite-horizon setting: directly when the dynamics of the world are known, and via Monte-Carlo methods otherwise. The algorithms simultaneously learn how to act and what to remember. ..

    Shared control for navigation and balance of a dynamically stable robot.

    by Law Kwok Ho Cedric.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.
    Includes bibliographical references (leaves 106-112).
    Abstracts in English and Chinese.

    Chapter 1 --- Introduction
    Chapter 1.1 --- Motivation
    Chapter 1.2 --- Related work
    Chapter 1.3 --- Thesis overview
    Chapter 2 --- Single wheel robot: Gyrover
    Chapter 2.1 --- Background
    Chapter 2.2 --- Robot concept
    Chapter 2.3 --- System description
    Chapter 2.4 --- Flywheel characteristics
    Chapter 2.5 --- Control patterns
    Chapter 3 --- Learning Control
    Chapter 3.1 --- Motivation
    Chapter 3.2 --- Cascade Neural Network with Kalman filtering
    Chapter 3.3 --- Learning architecture
    Chapter 3.4 --- Input space
    Chapter 3.5 --- Model evaluation
    Chapter 3.6 --- Training procedures
    Chapter 4 --- Control Architecture
    Chapter 4.1 --- Behavior-based approach
    Chapter 4.1.1 --- Concept and applications
    Chapter 4.1.2 --- Levels of competence
    Chapter 4.2 --- Behavior-based control of Gyrover: architecture
    Chapter 4.3 --- Behavior-based control of Gyrover: case studies
    Chapter 4.3.1 --- Vertical balancing
    Chapter 4.3.2 --- Tiltup motion
    Chapter 4.4 --- Discussions
    Chapter 5 --- Implementation of Learning Control
    Chapter 5.1 --- Validation
    Chapter 5.1.1 --- Vertical balancing
    Chapter 5.1.2 --- Tilt-up motion
    Chapter 5.1.3 --- Discussions
    Chapter 5.2 --- Implementation
    Chapter 5.2.1 --- Vertical balanced motion
    Chapter 5.2.2 --- Tilt-up motion
    Chapter 5.3 --- Combined motion
    Chapter 5.4 --- Discussions
    Chapter 6 --- Shared Control
    Chapter 6.1 --- Concept
    Chapter 6.2 --- Schemes
    Chapter 6.2.1 --- Switch mode
    Chapter 6.2.2 --- Distributed mode
    Chapter 6.2.3 --- Combined mode
    Chapter 6.3 --- Shared control of Gyrover
    Chapter 6.4 --- How to share
    Chapter 6.5 --- Experimental study
    Chapter 6.5.1 --- Heading control
    Chapter 6.5.2 --- Straight path
    Chapter 6.5.3 --- Circular path
    Chapter 6.5.4 --- Point-to-point navigation
    Chapter 6.6 --- Discussions
    Chapter 7 --- Conclusion
    Chapter 7.1 --- Contributions
    Chapter 7.2 --- Future work