18 research outputs found

    TASE: Task-Aware Speech Enhancement for Wake-Up Word Detection in Voice Assistants

    Wake-up word spotting in noisy environments is a critical task for an excellent user experience with voice assistants. Unwanted activation of the device is often due to noise coming from background conversations, TVs, or other domestic appliances. In this work, we propose the use of a speech enhancement convolutional autoencoder, coupled with on-device keyword spotting, aimed at improving trigger word detection in noisy environments. The end-to-end system learns by optimizing a linear combination of losses: a reconstruction-based loss, both at the log-mel spectrogram and at the waveform level, and a task-specific loss that accounts for the cross-entropy error of the keyword spotting detection. We experiment with several neural network classifiers and report that deeply coupling the speech enhancement with the wake-up word detector, e.g., by jointly training them, significantly improves performance in the noisiest conditions. Additionally, we introduce a new publicly available speech database recorded for Telefónica's voice assistant, Aura. The OK Aura Wake-up Word Dataset incorporates rich metadata, such as speaker demographics and room conditions, and comprises hard negative examples carefully selected to present different levels of phonetic similarity with respect to the trigger words 'OK Aura'. Keywords: speech enhancement; wake-up word; keyword spotting; deep learning; convolutional neural network
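The joint objective described above can be pictured as a weighted sum of two reconstruction terms and the keyword-spotting cross-entropy. The following is a minimal sketch of such a loss, not the authors' implementation: the L1 distances, the mel front-end settings, the module names (enhancer, kws_classifier) and the weights are illustrative assumptions that the abstract does not specify.

```python
# Minimal sketch (not the paper's code) of a joint speech-enhancement + keyword-spotting
# objective: weighted sum of log-mel and waveform reconstruction losses plus the
# cross-entropy of the wake-up word classifier on the enhanced audio.
import torch
import torch.nn.functional as F
import torchaudio

mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=64)  # assumed front-end

def joint_loss(enhancer, kws_classifier, noisy_wav, clean_wav, keyword_label,
               w_mel=1.0, w_wav=1.0, w_task=1.0):
    """Linear combination of reconstruction and task losses (illustrative weights)."""
    enhanced_wav = enhancer(noisy_wav)                     # denoised waveform estimate
    # Reconstruction losses at the waveform and log-mel spectrogram levels
    loss_wav = F.l1_loss(enhanced_wav, clean_wav)
    loss_mel = F.l1_loss(torch.log1p(mel(enhanced_wav)), torch.log1p(mel(clean_wav)))
    # Task loss: cross-entropy of the keyword-spotting classifier on the enhanced audio
    logits = kws_classifier(enhanced_wav)
    loss_task = F.cross_entropy(logits, keyword_label)
    return w_mel * loss_mel + w_wav * loss_wav + w_task * loss_task
```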

    Motion artifact reduction in PPG signals

    The aim of this thesis was to investigate methods for artifact removal in PPG signals and to implement and evaluate a few existing algorithms that claim to recover the amplitude information when removing motion artifacts from photoplethysmographic (PPG) signals captured by pulse oximeters. We developed a new method that uses a two-stage approach, based on singular value decomposition and the fixed-point FastICA algorithm, to generate a PPG-correlated reference signal that is then used in adaptive noise cancellation. The results were promising, and the proposed method is easy to implement and converges quickly with good extraction performance. It has few design parameters and only needs the estimated period of the PPG signal. The method could be used in a clinical routine for the prediction of intradialytic hypotension. However, although the method shows great potential, the simulations were conducted on only two healthy males; further studies on a larger dataset are needed to establish the full efficacy of the method.

Erroneous pulse oximeter readings during patient monitoring: An incorrect diagnosis is not something anyone wants to receive from their doctor. In hospital and clinical environments, or during emergency transport, the pulse oximeter, which among other things measures the oxygen saturation of the blood via the finger, can give erroneous readings because of voluntary or involuntary movements of the patient. In recent years, biomedical technology has grown rapidly, enabling more effective treatments and more reliable diagnoses. To obtain clinically correct measurements from medical equipment, these devices must be optimised as well as possible; this helps healthcare staff draw correct conclusions when making decisions during patient monitoring. A patient with, for example, kidney failure has problems clearing waste products and removing water from the blood, which is the kidneys' main task. During haemodialysis treatment, the blood is pumped out of the body via needles and then cleaned in a dialyser that replaces the function of the kidneys. A common side effect of the treatment is a drop in blood pressure (intradialytic hypotension), which occurs in 25% of all treatments. Results from earlier research show that such blood pressure drops during haemodialysis can be predicted using the amplitude of the photoplethysmography (PPG) signal. The PPG signal is obtained from the pulse oximeter, which has a clip attached to the fingertip: by passing light of two wavelengths through the skin, the absorption in the blood is used to read off oxygen saturation and heart rate. The problem with the PPG signal is that patient movement affects its amplitude, so it is important to remove the effect of motion in a way that preserves the amplitude information. We investigated methods for removing these motion effects and proposed a new method that efficiently estimates a clean PPG signal with the amplitude information preserved. The method has few design parameters and converges quickly to the solution. It could be used in a clinical routine for the prediction of intradialytic hypotension during haemodialysis treatment. It should, however, be noted that the study was conducted on two healthy test subjects and that more data would be required for a full-scale evaluation of the method.
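As a rough illustration of the kind of pipeline the abstract outlines, the sketch below builds a PPG-correlated reference from stacked signal periods using SVD and FastICA, then uses it in an LMS-based adaptive filter. It is not the thesis implementation: the segmentation into periods, the number of retained components, the component-selection rule, and the LMS step size and filter order are illustrative assumptions, and in practice the reference would have to be aligned and tiled to the length of the recording.

```python
# Hedged sketch: SVD + FastICA to derive a PPG-correlated reference, then LMS
# adaptive filtering to extract the PPG component from the noisy recording.
import numpy as np
from sklearn.decomposition import FastICA

def ppg_reference(segments):
    """segments: (n_periods, period_len) array of stacked PPG periods (assumed given)."""
    # Stage 1: low-rank SVD approximation keeps the dominant quasi-periodic pulse shape.
    U, s, Vt = np.linalg.svd(segments, full_matrices=False)
    low_rank = U[:, :2] @ np.diag(s[:2]) @ Vt[:2]
    # Stage 2: FastICA separates the remaining sources; keep the one most correlated
    # with the mean pulse as the PPG-correlated reference.
    sources = FastICA(n_components=2, random_state=0).fit_transform(low_rank.T).T
    pulse = low_rank.mean(axis=0)
    corr = [abs(np.corrcoef(src, pulse)[0, 1]) for src in sources]
    return sources[int(np.argmax(corr))]

def lms_extract(primary, reference, mu=1e-3, order=16):
    """LMS adaptive filter: with a PPG-correlated reference (same length as `primary`),
    the filter output converges to the PPG component of the primary input."""
    w = np.zeros(order)
    enhanced = np.zeros_like(primary, dtype=float)
    for n in range(order, len(primary)):
        x = reference[n - order:n][::-1]
        y = w @ x                     # current estimate of the PPG component
        e = primary[n] - y            # residual (mostly motion artifact) drives adaptation
        w += 2 * mu * e * x
        enhanced[n] = y
    return enhanced
```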

    Multimedia sensors embedded in smartphones for ambient assisted living and e-health

    Nowadays, the use of smartphones to make human life more comfortable is widespread, and there is particular interest in Ambient Assisted Living (AAL) and e-Health applications. Sensor technology is growing, and the number of sensors embedded in smartphones can be very useful for AAL and e-Health. While some sensors, such as the accelerometer, gyroscope or light sensor, are widely used in applications like motion detection or light metering, others, such as the microphone and camera, can be used as multimedia sensors. This paper reviews the published work that presents proposals, designs and deployments making use of multimedia sensors for AAL and e-Health, classified according to their main use: the sound gathered by the microphone and the images recorded by the camera. We also include a comparative table and analyse the gathered information.

Parra-Boronat, L.; Sendra, S.; Jimenez, JM.; Lloret, J. (2016). Multimedia sensors embedded in smartphones for ambient assisted living and e-health. Multimedia Tools and Applications 75(21):13271-13297. doi:10.1007/s11042-015-2745-8

    Acoustic sensing as a novel approach for cardiovascular monitoring at the wrist

    Cardiovascular diseases are the number one cause of deaths globally. An increased cardiovascular risk can be detected by regular monitoring of the vital signs, including the heart rate, the heart rate variability (HRV) and the blood pressure. For a user to undergo continuous vital sign monitoring, wearable systems prove to be very useful, as the device can be integrated into the user's lifestyle without affecting the daily activities. However, the main challenge associated with the monitoring of these cardiovascular parameters is the requirement of different sensing mechanisms at different measurement sites: there is no single wearable device that can provide sufficient physiological information to track the vital signs from a single site on the body. This thesis proposes a novel concept of using acoustic sensing over the radial artery to extract cardiac parameters for vital sign monitoring. A wearable system consisting of a microphone is designed to allow the detection of the heart sounds together with the pulse wave, an attribute not possible with existing wrist-based sensing methods.

Methods: The acoustic signals recorded from the radial artery are a continuous reflection of the instantaneous cardiac activity. These signals are studied and characterised using different algorithms to extract cardiovascular parameters. The validity of the proposed principle is first demonstrated using a novel algorithm to extract the heart rate from these signals. The algorithm utilises the power spectral analysis of the acoustic pulse signal to detect the S1 sounds and, additionally, the K-means method to remove motion artifacts for an accurate heartbeat detection. The HRV in the short-term acoustic recordings is found by extracting the S1 events using the relative information between the short- and long-term energies of the signal. The S1 events are localised using three different characteristic points, and the best representation is found by comparing the instantaneous heart rate profiles. The possibility of measuring the blood pressure using the wearable device is shown by recording the acoustic signal under the influence of external pressure applied on the arterial branch. The temporal and spectral characteristics of the acoustic signal are utilised to extract the feature signals and obtain a relationship with the systolic blood pressure (SBP) and diastolic blood pressure (DBP), respectively.

Results: This thesis proposes three different algorithms to find the heart rate, the HRV and the SBP/DBP readings from the acoustic signals recorded at the wrist. The results obtained by each algorithm are as follows. 1. The heart rate algorithm is validated on a dataset consisting of 12 subjects with a data length of 6 hours. The results demonstrate an accuracy of 98.78%, a mean absolute error of 0.28 bpm, limits of agreement between -1.68 and 1.69 bpm, and a correlation coefficient of 0.998 with reference to a state-of-the-art PPG-based commercial device. A high statistical agreement between the heart rate obtained from the acoustic signal and the photoplethysmography (PPG) signal is observed. 2. The HRV algorithm is validated on short-term acoustic signals of 5-minutes duration recorded from each of the 12 subjects. A comparison is established with the simultaneously recorded electrocardiography (ECG) and PPG signals. The instantaneous heart rate for all the subjects combined achieves an accuracy of 98.50% and 98.96% with respect to the ECG and PPG signals respectively. The results for the time-domain and frequency-domain HRV parameters also demonstrate high statistical agreement with the ECG and PPG signals. 3. The algorithm proposed for the SBP/DBP determination is validated on 104 acoustic signals recorded from 40 adult subjects. The experimental outputs, when compared with the reference arm- and wrist-based monitors, produce a mean error of less than 2 mmHg and a standard deviation of error around 6 mmHg. Based on these results, this thesis shows the potential of this new sensing modality to be used as an alternative, or to complement existing methods, for the continuous monitoring of heart rate and HRV, and spot measurement of the blood pressure at the wrist.
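As a rough sketch of the style of heart-rate estimation described for the first algorithm (band-limited power to locate S1-like events, then K-means to reject motion-artifact events), the illustration below makes several assumptions not given in the abstract: the frequency band, window length, per-event features, and the rule used to keep the more regular cluster are illustrative choices, not the thesis implementation.

```python
# Hedged sketch: locate S1-like events from the short-time power of a band-limited
# wrist acoustic signal, then use K-means on simple per-event features to separate
# plausible heartbeats from motion-artifact events and estimate the heart rate.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks
from sklearn.cluster import KMeans

def estimate_heart_rate(acoustic, fs):
    # Band-pass where S1 energy is assumed to concentrate (illustrative 20-100 Hz band).
    b, a = butter(4, [20 / (fs / 2), 100 / (fs / 2)], btype="band")
    band = filtfilt(b, a, acoustic)
    # Short-time power envelope over ~50 ms windows.
    win = int(0.05 * fs)
    power = np.convolve(band ** 2, np.ones(win) / win, mode="same")
    # Candidate S1 events: power peaks separated by at least 0.3 s.
    peaks, props = find_peaks(power, distance=int(0.3 * fs), height=0)
    # Two simple features per event: peak power and local spacing between candidates.
    feats = np.column_stack([props["peak_heights"], np.gradient(peaks.astype(float))])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(feats)
    # Keep the cluster with the more regular inter-event spacing; treat the other
    # cluster as motion artifacts.
    keep = min((0, 1), key=lambda c: np.std(np.diff(peaks[labels == c])))
    beats = np.sort(peaks[labels == keep])
    return 60.0 * fs / np.median(np.diff(beats))   # beats per minute
```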

    State of the art of audio- and video based solutions for AAL

    Working Group 3. Audio- and Video-based AAL Applications.

It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to their high potential in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives.

In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness. Indeed, cameras and microphones are far less obtrusive with respect to the hindrance other wearable sensors may cause to one's activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large range of sensing, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals' activities and health status can derive from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach.

This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake in real-world settings of AAL technologies. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in the AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potentials coming from the silver economy are overviewed.

    IberSPEECH 2020: XI Jornadas en TecnologĂ­a del Habla and VII Iberian SLTech

    IberSPEECH2020 is a two-day event bringing together the best researchers and practitioners in speech and language technologies in Iberian languages to promote interaction and discussion. The organizing committee has planned a wide variety of scientific and social activities, including technical paper presentations, keynote lectures, presentation of projects, laboratory activities, recent PhD theses, discussion panels, a round table, and awards to the best thesis and papers. The program of IberSPEECH2020 includes a total of 32 contributions that will be presented, distributed among 5 oral sessions, a PhD session, and a projects session. To ensure the quality of all the contributions, each submitted paper was reviewed by three members of the scientific review committee. All the papers in the conference will be accessible through the International Speech Communication Association (ISCA) Online Archive. Paper selection was based on the scores and comments provided by the scientific review committee, which includes 73 researchers from different institutions (mainly from Spain and Portugal, but also from France, Germany, Brazil, Iran, Greece, Hungary, the Czech Republic, Ukraine and Slovenia). Furthermore, it has been confirmed that extended versions of selected papers will be published as a special issue of the journal Applied Sciences, "IberSPEECH 2020: Speech and Language Technologies for Iberian Languages", published by MDPI with fully open access. In addition to the regular paper sessions, the IberSPEECH2020 scientific program also features the ALBAYZIN evaluation challenge session.

Red Española de Tecnologías del Habla. Universidad de Valladolid.

    State of the Art of Audio- and Video-Based Solutions for AAL

    It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to their high potential in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives.

In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness. Indeed, cameras and microphones are far less obtrusive with respect to the hindrance other wearable sensors may cause to one's activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters. Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large range of sensing, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals' activities and health status can derive from processing audio signals. Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach.

This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely lifelogging and self-monitoring, remote monitoring of vital signs, emotional state recognition, food intake monitoring, activity and behaviour recognition, activity and personal assistance, gesture recognition, fall detection and prevention, mobility assessment and frailty recognition, and cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake in real-world settings of AAL technologies. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in the AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potentials coming from the silver economy are overviewed.