7 research outputs found

    Towards End-to-End spoken intent recognition in smart home

    Voice-based interaction in a smart home has become a feature of many industrial products. These systems react to voice commands, whether for answering a question, playing music or turning on the lights. To be efficient, they must be able to extract the user's intent from the voice command. Intent recognition from voice is typically performed as a pipeline: automatic speech recognition (ASR) followed by intent classification on the transcriptions. However, the errors accumulated at the ASR stage might severely impact the intent classifier. In this paper, we propose an End-to-End (E2E) model that performs intent classification directly from the raw speech input. The E2E approach is thus optimized for this specific task and avoids error propagation. Furthermore, the E2E model can exploit prosodic aspects of the speech signal for intent classification (e.g., question vs. imperative voice). Experiments on a corpus of voice commands acquired in a real smart home reveal that the state-of-the-art pipeline baseline is still superior to the E2E approach. However, we show that artificial data generation techniques bring significant improvement to the E2E model, allowing it to reach competitive performance. This opens the way to further research on E2E Spoken Language Understanding.

    SLU FOR VOICE COMMAND IN SMART HOME: COMPARISON OF PIPELINE AND END-TO-END APPROACHES

    Spoken Language Understanding (SLU) is typically performed through automatic speech recognition (ASR) and natural language understanding (NLU) in a pipeline. However, errors at the ASR stage have a negative impact on the NLU performance. Hence, there is a rising interest in End-to-End (E2E) SLU to jointly perform ASR and NLU. Although E2E models have shown superior performance to modular approaches in many NLP tasks, current SLU E2E models have still not definitively superseded pipeline approaches. In this paper, we present a comparison of the pipeline and E2E approaches for the task of voice command in smart homes. Since there are no large non-English domain-specific data sets available, although they are needed for an E2E model, we tackle the lack of such data by combining Natural Language Generation (NLG) and text-to-speech (TTS) to generate French training data. The trained models were evaluated on voice commands acquired in a real smart home with several speakers. Results show that the E2E approach can reach performance similar to a state-of-the-art pipeline SLU despite a higher WER than the pipeline approach. Furthermore, the E2E model can benefit from artificially generated data to exhibit lower Concept Error Rates than the pipeline baseline for slot recognition.
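
    The abstract above compares systems by word error rate (WER) and Concept Error Rate without defining them. As a minimal sketch, assuming the common definition of both metrics (edit distance between reference and hypothesis sequences, divided by the reference length), they could be computed as follows. The function names and the example command are hypothetical and are not taken from the paper, whose exact scoring procedure is not given in this listing.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance normalised by reference length.

    Applied to words this is WER; applied to sequences of concept
    (slot) labels it is one common form of Concept Error Rate.
    """
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(ref_text, hyp_text):
    """WER over whitespace-separated words."""
    return error_rate(ref_text.split(), hyp_text.split())


if __name__ == "__main__":
    # Hypothetical command with one misrecognised word out of five.
    print(wer("turn on the kitchen light", "turn on the chicken light"))
```

    Because the rate is normalised by the reference length only, a hypothesis with many insertions can score above 1.0.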

    State of the art on ethical, legal, and social issues linked to audio- and video-based AAL solutions - Uploaded on December 29, 2021

    Ambient assisted living (AAL) technologies are increasingly presented and sold as essential smart additions to daily life and home environments that will radically transform the healthcare and wellness markets of the future. An ethical approach and a thorough understanding of all ethics in surveillance/monitoring architectures are therefore pressing. AAL poses many ethical challenges raising questions that will affect immediate acceptance and long-term usage. Furthermore, ethical issues emerge from social inequalities and their potential exacerbation by AAL, accentuating the existing access gap between high-income countries (HIC) and low- and middle-income countries (LMIC). Legal aspects mainly refer to the adherence to existing legal frameworks and cover issues related to product safety, data protection, cybersecurity, intellectual property, and access to data by public, private, and government bodies. Successful privacy-friendly AAL applications are needed, as the pressure to bring Internet of Things (IoT) devices and ones equipped with artificial intelligence (AI) quickly to market cannot overlook the fact that the environments in which AAL will operate are mostly private (e.g., the home). The social issues focus on the impact of AAL technologies before and after their adoption. Future AAL technologies need to consider all aspects of equality such as gender, race, age and social disadvantages and avoid increasing loneliness and isolation among, e.g., older and frail people. Finally, the current power asymmetries between the target and general populations should not be underestimated, nor should the discrepant needs and motivations of the target group and those developing and deploying AAL systems. Whilst AAL technologies provide promising solutions for the health and social care challenges, they are not exempt from ethical, legal and social issues (ELSI). A set of ELSI guidelines is needed to integrate these factors at the research and development stage.

    State of the art on ethical, legal, and social issues linked to audio- and video-based AAL solutions

    Working Group 1. Social responsibility: Ethical, legal, social, data protection and privacy issues. Abstract: Ambient assisted living (AAL) technologies are increasingly presented and sold as essential smart additions to daily life and home environments that will radically transform the healthcare and wellness markets of the future. An ethical approach and a thorough understanding of all ethics in surveillance/monitoring architectures are therefore pressing. AAL poses many ethical challenges raising questions that will affect immediate acceptance and long-term usage. Furthermore, ethical issues emerge from social inequalities and their potential exacerbation by AAL, accentuating the existing access gap between high-income countries (HIC) and low- and middle-income countries (LMIC). Legal aspects mainly refer to the adherence to existing legal frameworks and cover issues related to product safety, data protection, cybersecurity, intellectual property, and access to data by public, private, and government bodies. Successful privacy-friendly AAL applications are needed, as the pressure to bring Internet of Things (IoT) devices and ones equipped with artificial intelligence (AI) quickly to market cannot overlook the fact that the environments in which AAL will operate are mostly private (e.g., the home). The social issues focus on the impact of AAL technologies before and after their adoption. Future AAL technologies need to consider all aspects of equality such as gender, race, age and social disadvantages and avoid increasing loneliness and isolation among, e.g., older and frail people. Finally, the current power asymmetries between the target and general populations should not be underestimated, nor should the discrepant needs and motivations of the target group and those developing and deploying AAL systems. Whilst AAL technologies provide promising solutions for the health and social care challenges, they are not exempt from ethical, legal and social issues (ELSI). 
A set of ELSI guidelines is needed to integrate these factors at the research and development stage. Keywords: Ethical principles, Privacy, Assistive Living Technologies, Privacy by Design, General Data Protection Regulation.

    State of the art of audio- and video based solutions for AAL

    Working Group 3. Audio- and Video-based AAL Applications. It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to the high potential they have in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairment. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness. 
Indeed, cameras and microphones are far less obtrusive than wearable sensors, which may hinder one’s activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large range of sensing, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals’ activities and health status can derive from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach. This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. 
It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake in real-world settings of AAL technologies. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in the AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potentials coming from the silver economy are overviewed.

    State of the Art of Audio- and Video-Based Solutions for AAL

    It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living technologies come as a viable approach to help face these challenges, thanks to the high potential they have in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairment. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness. 
Indeed, cameras and microphones are far less obtrusive than wearable sensors, which may hinder one’s activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters. Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large range of sensing, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals’ activities and health status can derive from processing audio signals. Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach. This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. 
It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely lifelogging and self-monitoring, remote monitoring of vital signs, emotional state recognition, food intake monitoring, activity and behaviour recognition, activity and personal assistance, gesture recognition, fall detection and prevention, mobility assessment and frailty recognition, and cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake in real-world settings of AAL technologies. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in the AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potentials coming from the silver economy are overviewed.

    Context-Aware Voice-based Interaction in Smart Home -VocADom@A4H Corpus Collection and Empirical Assessment of its Usefulness

    Smart homes aim at enhancing the quality of life of people at home by the use of home automation systems and Ambient Intelligence. Most of these smart homes provide enhanced interaction by relying on context-aware systems learned on data. Whereas voice-based interaction is the current emerging trend, most available corpora are either concerned only with home automation sensors or only with audio technology, which limits the development of context-aware voice-based systems. This paper presents the VocADom@A4H corpus, which is a dataset composed of users’ interactions recorded in a fully equipped Smart Home. About 12 hours of multichannel distant speech signal synchronized with logs of an openHAB home automation system were collected from 11 participants who performed activities of daily living in the presence of real-life noises, such as other persons speaking, use of a vacuum cleaner, TV, etc. This corpus can serve as valuable material for studies in pervasive intelligence, such as human tracking, human activity recognition, context-aware interaction, and robust distant speech processing in the home. Experiments performed on multichannel speech and home automation sensor data for robust voice activity detection and multiresident localization show the potential of the corpus to support the development of context-aware smart home systems.