212 research outputs found

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications able to operate in real-world environments, such as mobile communication services and smart homes.
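The speech-feature extraction step mentioned above usually begins by slicing the waveform into short overlapping frames and computing a windowed power spectrum, the common front end for MFCC-style features. A minimal sketch (frame length, hop size and FFT size are illustrative defaults for 16 kHz audio, not values taken from the book):

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def power_spectrum(frames, n_fft=512):
    """Hamming-windowed power spectrum of each frame, the basis of MFCC features."""
    windowed = frames * np.hamming(frames.shape[1])
    return np.abs(np.fft.rfft(windowed, n_fft)) ** 2 / n_fft

# Example: one second of a synthetic 440 Hz tone sampled at 16 kHz.
t = np.arange(16000) / 16000.0
signal = np.sin(2 * np.pi * 440 * t)
frames = frame_signal(signal)          # shape (98, 400)
spec = power_spectrum(frames)          # shape (98, 257)
```

From here, a mel filterbank, a log, and a DCT would yield the classic MFCC representation; searching the hypothesis space then operates on sequences of such feature vectors.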

    A survey of the application of soft computing to investment and financial trading


    Sound-production Related Cognitive Tasks for Onset Detection in Self-Paced Brain-Computer Interfaces

    Objective. The main goal of this research is to propose a novel onset detection method for Self-Paced (SP) Brain-Computer Interfaces (BCIs), increasing the usability and practicality of BCIs and moving them from laboratory research settings towards real-world use. Approach. To achieve this goal, various Sound-Production Related Cognitive Tasks (SPRCTs) were tested against the idle state in offline and simulated-online experiments. An online experiment was then conducted in which executing the Sound Imagery (SI) onset detection task turned on a messenger dialogue when a new message arrived, during real-life scenarios (e.g. watching video, reading text). The SI task was chosen as an onset task because of its advantages over other tasks: 1) it is intuitive; 2) it is suitable for people with motor disabilities; 3) it has no significant overlap with other common, spontaneous cognitive states, making it easier to use in daily-life situations; 4) it does not depend on the user's native language. Main results. The final online experimental results showed that the new SI onset task performed significantly better than the Motor Imagery (MI) approach: a TFP score of 84.04% (SI) vs. 66.79% (MI) for the sliding-image scenario, and 80.84% vs. 61.07% for the video-watching scenario. Furthermore, onset responses were significantly faster for the SI task than for MI. In terms of usability, 75% of subjects answered that SI was easier to use. Significance. The new SPRCT outperforms typical MI for SP onset-detection BCIs (significantly better performance, faster onset response and easier usability), and would therefore be more readily used in daily-life situations. Another contribution of this thesis is a novel method for selecting and handling EMG-artefact-contaminated EEG channels, which improved class separation significantly compared with typical blind source separation techniques.
A new performance evaluation metric for SP BCIs, the true-false positive (TFP) score, was also proposed as a standardised assessment method that accounts for idle-period length, which typical metrics ignore.
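The abstract does not give the exact formula for the TFP score, but the motivation is clear: an SP BCI that is idle most of the time should be penalised for false activations in proportion to how long it had to stay quiet. A purely illustrative sketch of a metric with that shape (not the thesis's actual formula; all names and the weighting are assumptions):

```python
def tfp_score(tp, n_onsets, fp, idle_seconds, fp_window=1.0):
    """Illustrative TFP-style score for a self-paced BCI.

    tp / n_onsets   -- fraction of intended onsets correctly detected
    fp              -- count of false activations during the idle period
    idle_seconds    -- total idle time; longer idle periods dilute the
                       penalty of a fixed number of false positives
    """
    tpr = tp / n_onsets
    # false activations per one-second idle window, capped at 1
    fp_rate = min(fp / (idle_seconds / fp_window), 1.0)
    return 100.0 * tpr * (1.0 - fp_rate)

# 40 of 50 onsets detected, 12 false positives over 120 s of idle time.
score = tfp_score(40, 50, 12, 120)  # about 72
```

The point of the sketch is the `idle_seconds` term: a plain true/false-positive count would give the same score to a system tested over 10 s of idle time and one tested over 10 minutes, which is exactly what the proposed metric is meant to fix.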

    Leveraging EEG-based speech imagery brain-computer interfaces

    Speech Imagery Brain-Computer Interfaces (BCIs) provide an intuitive and flexible way of interacting via brain activity recorded during imagined speech. Imagined speech can be decoded in the form of syllables or words and captured even with non-invasive measurement methods such as electroencephalography (EEG). Over the last decade, research in this field has made tremendous progress, and prototypical implementations of EEG-based Speech Imagery BCIs are numerous. However, most work is still conducted in controlled laboratory environments with offline classification and does not find its way into real online scenarios. In this thesis we identify three main reasons for this: mentally and physically exhausting training procedures, insufficient classification accuracies, and cumbersome EEG setups with usually high-resolution headsets. We elaborate on possible solutions to these problems and present and evaluate new methods in each domain. Specifically, we introduce two new training concepts for imagined-speech BCIs, one based on EEG activity recorded while silently reading and the other while overtly speaking certain words. Insufficient classification accuracies are addressed by introducing the concept of a Semantic Speech Imagery BCI, which classifies the semantic category of an imagined word before the word itself to increase the performance of the system. Finally, we investigate different techniques for electrode reduction in Speech Imagery BCIs, aiming to find a suitable subset of electrodes for EEG-based imagined-speech detection and thereby simplify the cumbersome setups.
All of our results, together with general remarks on experience and best practice for study setups concerning imagined speech, are summarized and intended to act as guidelines for further research in the field, thereby advancing Speech Imagery BCIs towards real-world application.
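Electrode reduction of the kind described above is often framed as channel subset selection: score candidate subsets with a simple classifier and grow the subset greedily. The thesis's actual techniques are not specified in the abstract; the following is a generic forward-selection sketch on synthetic data, using a leave-one-out nearest-centroid score so it stays self-contained:

```python
import numpy as np

def centroid_score(X, y, channels):
    """Leave-one-out nearest-centroid accuracy using only `channels`."""
    Xs = X[:, channels]
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        c0 = Xs[keep & (y == 0)].mean(axis=0)
        c1 = Xs[keep & (y == 1)].mean(axis=0)
        pred = int(np.linalg.norm(Xs[i] - c1) < np.linalg.norm(Xs[i] - c0))
        hits += pred == y[i]
    return hits / len(y)

def greedy_select(X, y, k):
    """Forward selection: repeatedly add the channel that raises accuracy most."""
    chosen = []
    for _ in range(k):
        rest = [c for c in range(X.shape[1]) if c not in chosen]
        best = max(rest, key=lambda c: centroid_score(X, y, chosen + [c]))
        chosen.append(best)
    return chosen

# Synthetic data: 40 trials x 8 channels; channels 2 and 5 carry the class signal.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 8))
X[y == 1, 2] += 2.0
X[y == 1, 5] += 2.0
subset = greedy_select(X, y, 2)
```

In a real EEG pipeline the scoring function would be the actual decoder evaluated with proper cross-validation, but the structure is the same: accuracy with the reduced montage is traded off against setup effort.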

    The Future of Humanoid Robots

    This book provides state-of-the-art scientific and engineering research findings and developments in the field of humanoid robotics and its applications. It is expected that humanoids will change the way we interact with machines and will have the ability to blend perfectly into an environment already designed for humans. The book contains chapters that aim to explore the future abilities of humanoid robots by presenting a variety of integrated research in various scientific and engineering fields, such as locomotion, perception, adaptive behavior, human-robot interaction, neuroscience and machine learning. The book is designed to be accessible and practical, with an emphasis on information useful to those working in robotics, cognitive science, artificial intelligence, computational methods and other fields of science directly or indirectly related to the development and usage of future humanoid robots. The editor of the book has extensive R&D experience, patents, and publications in the area of humanoid robotics, and this experience is reflected in the content of the book.

    State of the art of audio- and video based solutions for AAL

    Working Group 3: Audio- and Video-based AAL Applications.
    Europe is facing more and more crucial challenges regarding health and social care due to demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, highlighting the need to take action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to their high potential for enabling remote care and support. Broadly speaking, AAL refers to the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply persons in need with smart assistance, responding to their needs for autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them into their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages in terms of unobtrusiveness and information richness.
    Indeed, cameras and microphones are far less obtrusive than the wearable sensors that may hinder one's activities. In addition, a single camera placed in a room can record most of the activities performed there, thus replacing many non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, movements, and overall conditions of assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they have a large sensing range, do not require physical presence at a particular location, and are physically intangible. Moreover, relevant information about individuals' activities and health status can be derived from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate settings where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL that ensures ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach. This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL.
    It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time, and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethics-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, hindrances and opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, the report illustrates the current procedural and technological approaches to achieving acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potential coming from the silver economy is reviewed.

    Affective Computing

    This book provides an overview of state-of-the-art research in Affective Computing. It presents new ideas, original results and practical experiences in this increasingly important research field. The book consists of 23 chapters categorized into four sections. Since one of the most important means of human communication is facial expression, the first section of this book (Chapters 1 to 7) presents research on the synthesis and recognition of facial expressions. Given that we use not only the face but also body movements to express ourselves, the second section (Chapters 8 to 11) presents research on the perception and generation of emotional expressions using full-body motion. The third section (Chapters 12 to 16) presents computational models of emotion, as well as findings from neuroscience research. The last section (Chapters 17 to 22) presents applications related to affective computing.

    Patient centric intervention for children with high functioning autism spectrum disorder. Can ICT solutions improve the state of the art?

    In my PhD research we developed an integrated technological platform for the acquisition of neurophysiologic signals in a semi-naturalistic setting where children are free to move around, play with different objects and interact with the examiner. The interaction with the examiner, rather than with a screen, is another very important feature of this research, as it recreates a more realistic situation with social interactions and cues. In this paradigm, we can assume that the signals acquired from the brain and the autonomic nervous system are much more similar to those generated while the child interacts in everyday situations. This setting, with a relatively simple technical implementation, can be considered one step towards a more behaviorally driven analysis of neurophysiologic activity. Within the context of a pilot open trial, we showed the feasibility of the technological platform applied to classical intervention solutions for autism. We found that (1) the platform was useful during both child-therapist interaction at the hospital and child-parent interaction at home, and (2) tailored intervention was compatible with at-home use by non-professional therapists (parents). Going back to the title of my thesis, 'Can ICT solutions improve the state of the art?', the answer could be: 'Yes, it can be a useful support for a skilled professional in the field of autism.'