
    Estimating Carotid Pulse and Breathing Rate from Near-infrared Video of the Neck

    Objective: Non-contact physiological measurement is a growing research area that allows capturing vital signs such as heart rate (HR) and breathing rate (BR) comfortably and unobtrusively with remote devices. However, most approaches work only in bright environments, in which subtle photoplethysmographic and ballistocardiographic signals can be easily analyzed, and/or require expensive and custom hardware to perform the measurements. Approach: This work introduces a low-cost method to measure the subtle motions associated with the carotid pulse and breathing movement from the neck using near-infrared (NIR) video imaging. A skin reflection model of the neck was established to provide a theoretical foundation for the method. In particular, the method relies on template matching for neck detection, Principal Component Analysis for feature extraction, and Hidden Markov Models for data smoothing. Main Results: We compared the estimated HR and BR measures with those provided by an FDA-cleared device in a 12-participant laboratory study: the estimates achieved a mean absolute error of 0.36 beats per minute and 0.24 breaths per minute under both bright and dark lighting. Significance: This work advances the possibilities of non-contact physiological measurement in real-life conditions in which environmental illumination is limited and in which the face of the person is not readily available or needs to be protected. Due to the increasing availability of NIR imaging devices, the described methods are readily scalable.
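    The processing chain above lends itself to a compact sketch. The following is a minimal illustration under stated assumptions, not the authors' code: the frames, template and function names are hypothetical, and the paper's Hidden-Markov-Model smoothing step is replaced here by a simple in-band FFT peak pick.

        # Hypothetical sketch: template matching for the neck ROI, PCA over
        # pixel motion traces, and a spectral peak for the rate. Replace the
        # band with, e.g., (0.1, 0.5) Hz to target breathing instead of pulse.
        import cv2
        import numpy as np
        from sklearn.decomposition import PCA

        def estimate_rate(frames, neck_template, fps, band=(0.7, 3.0)):
            """frames: grayscale NIR images; band: plausible rate band in Hz."""
            # 1. Locate the neck once with normalized cross-correlation.
            res = cv2.matchTemplate(frames[0], neck_template, cv2.TM_CCOEFF_NORMED)
            _, _, _, (x, y) = cv2.minMaxLoc(res)
            h, w = neck_template.shape
            # 2. Stack the ROI over time: one row per frame, one column per pixel.
            traces = np.stack([f[y:y+h, x:x+w].astype(float).ravel() for f in frames])
            traces -= traces.mean(axis=0)  # remove each pixel's DC level
            # 3. PCA: the leading components capture the dominant subtle motions.
            comps = PCA(n_components=5).fit_transform(traces)
            # 4. Keep the component with the strongest in-band spectral peak.
            freqs = np.fft.rfftfreq(len(frames), d=1.0 / fps)
            mask = (freqs >= band[0]) & (freqs <= band[1])
            spectra = np.abs(np.fft.rfft(comps, axis=0))
            best = spectra[mask].max(axis=0).argmax()
            return 60.0 * freqs[mask][spectra[mask, best].argmax()]  # cycles/min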

    An In-Vehicle Vision-Based Driver's Drowsiness Detection System

    Many traffic accidents have been attributed to drivers' drowsiness and fatigue. Drowsiness degrades driving performance through declines in visibility, situational awareness and decision-making capability. In this study, a vision-based drowsiness detection and warning system is presented, which attempts to bring a driver's attention to his/her own potential drowsiness. The information provided by the system can also be utilized by adaptive systems to manage noncritical operations, such as starting a ventilator, spreading fragrance, turning on a radio, and providing entertainment options. In high-drowsiness situations, the system may initiate navigation aids and alert others to the drowsiness of the driver.

    The system estimates the fatigue level of a driver from his/her facial images, acquired by a video camera mounted in the front of the vehicle. Five major steps are involved: preprocessing, facial feature extraction, face tracking, parameter estimation, and reasoning. In the preprocessing step, the input image is sub-sampled to reduce the image size and, in turn, the processing time. A lighting compensation process is then applied to the reduced image to remove the influence of ambient illumination variations. Afterwards, a number of chrominance values are calculated for each image pixel, to be used in the next step for detecting facial features. Four sub-steps constitute the feature extraction step: skin detection, face localization, eye and mouth detection, and feature confirmation. To begin, skin areas are located in the image based on the chrominance values calculated in the previous step and a predefined skin model. We next search for the face region within the largest skin area. However, the detected face is typically imperfect, and facial feature detection within an imperfect face region is unreliable; we therefore look for facial features throughout the entire image and use the face region only to confirm the detected features. Once facial features are located, they are tracked over the video sequence until they fail to be detected in a video image, at which point the facial feature detection process is invoked again. Although facial feature detection is time-consuming, facial feature tracking is fast and reliable. During facial feature tracking, parameters of facial expression are estimated, including the percentage of eye closure over time, eye blinking frequency, durations of eye closure, gaze and mouth opening, as well as head orientation. The estimated parameters are then utilized in the reasoning step to determine the driver's drowsiness level: a fuzzy integral technique integrates the various parameter values to arrive at a decision about the drowsiness level of the driver (a sketch of this fusion step follows below).

    A number of video sequences of different drivers and illumination conditions have been tested. The results reveal that the system works reasonably well in daytime. In future work, the system may be extended to nighttime operation, for which infrared sensors should be included.
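    The reasoning step is the most self-contained piece and can be illustrated directly. The sketch below is hypothetical, not the authors' implementation: it fuses normalised drowsiness cues (e.g., percentage of eye closure, blink frequency, closure duration, head orientation) with a Sugeno fuzzy integral over a lambda-fuzzy measure, one common form of the fuzzy integral; the cue values and importance densities are invented for the example.

        import numpy as np

        def lambda_root(densities):
            # Solve prod(1 + l*g_i) = 1 + l for the unique root l > -1, l != 0.
            poly = np.array([1.0])
            for g in densities:
                poly = np.convolve(poly, [g, 1.0])  # multiply by (g*l + 1)
            poly[-2] -= 1.0                         # subtract l ...
            poly[-1] -= 1.0                         # ... and 1
            roots = np.roots(poly)
            real = roots[np.isreal(roots)].real
            cands = [r for r in real if r > -1.0 and abs(r) > 1e-9]
            return min(cands, key=abs) if cands else 0.0

        def sugeno_integral(h, g):
            """h: cue supports in [0, 1]; g: importance density of each cue."""
            lam = lambda_root(g)
            measure, best = 0.0, 0.0
            for i in np.argsort(h)[::-1]:           # cues by decreasing support
                measure = g[i] + measure + lam * g[i] * measure  # grow coalition
                best = max(best, min(h[i], measure))
            return best

        # Invented cue supports: eye closure, blink frequency, closure duration,
        # head orientation, each already normalised to [0, 1].
        cues = np.array([0.8, 0.6, 0.7, 0.3])
        densities = [0.4, 0.3, 0.35, 0.2]
        print("drowsiness level:", sugeno_integral(cues, densities))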

    State of the art of audio- and video-based solutions for AAL

    Working Group 3. Audio- and Video-based AAL Applications.

    It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to their high potential for enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them into their daily environments and lives.

    In this respect, video- and audio-based AAL applications have several advantages in terms of unobtrusiveness and information richness. Indeed, cameras and microphones are far less obtrusive than other wearable sensors, which may hinder one's activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they have a large sensing range, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals' activities and health status can be derived from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals, due to the richness of the information these technologies convey and the intimate settings where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL that ensures ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach.

    This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, (v) activity and behaviour recognition, (vi) activity and personal assistance, (vii) gesture recognition, (viii) fall detection and prevention, (ix) mobility assessment and frailty recognition, and (x) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, the report illustrates the current procedural and technological approaches to coping with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, privacy preservation in video and audio data, transparency and explainability in data processing, and data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potentials coming from the silver economy are overviewed.

    Robust Visual Heart Rate Estimation

    A novel heart rate estimator, HR-CNN, a two-step convolutional neural network, is presented. The network is trained end-to-end by alternating optimization to be robust to illumination changes and to relative movement of the subject and the camera. The network works well with images of the face roughly aligned by an off-the-shelf commercial frontal face detector. An extensive review of the literature on visual heart rate estimation identifies the key factors limiting the performance and reproducibility of existing methods: (i) a lack of publicly available datasets and incomplete descriptions of published experiments, (ii) the use of unreliable pulse oximeters for the ground-truth reference, and (iii) missing standard experimental protocols. A new, challenging, publicly available ECG-Fitness dataset with 205 sixty-second videos of subjects performing physical exercises is introduced. The dataset includes 17 subjects performing 4 activities (talking, rowing, exercising on a stepper and on a stationary bike) captured by two RGB cameras, one attached to the fitness machine in use, which vibrates significantly, and the other mounted on a separately standing tripod. Each subject repeats the "rowing" and "talking" activities under halogen lamp lighting, and for 4 subjects the whole recording session is additionally lit by an LED light. HR-CNN outperforms the published methods on the dataset, reducing error by more than half. Each ECG-Fitness activity contains a different combination of realistic challenges. HR-CNN performs best on the "rowing" activity, with a mean absolute error of 3.94, and worst on the "talking" activity, with a mean absolute error of 15.57.
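    The two-step structure can be made concrete with a schematic toy, which is not the published HR-CNN architecture: layer sizes, input shapes and the 150-frame clip are assumptions, and the alternating-optimization training loop is omitted. An extractor turns a face clip into a one-sample-per-frame pulse signal; an estimator regresses a rate from that signal.

        import torch
        import torch.nn as nn

        class Extractor(nn.Module):                      # frames -> pulse signal
            def __init__(self):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                    nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(32, 1),                    # one sample per frame
                )

            def forward(self, clip):                     # clip: (T, 3, H, W)
                return self.net(clip).squeeze(-1)        # signal: (T,)

        class Estimator(nn.Module):                      # pulse signal -> rate
            def __init__(self):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Conv1d(1, 16, 15, padding=7), nn.ReLU(),
                    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                    nn.Linear(16, 1),                    # scalar rate estimate
                )

            def forward(self, signal):                   # signal: (T,)
                return self.net(signal[None, None, :]).squeeze()

        # Alternating optimization would update one sub-network while the other
        # is frozen, then swap; here we only run a forward pass on random data.
        extractor, estimator = Extractor(), Estimator()
        clip = torch.randn(150, 3, 64, 64)               # 150 frames of a face crop
        rate = estimator(extractor(clip))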

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 out of the strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial topics have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA is held every two years in Firenze, Italy.

    Sitting behaviour-based pattern recognition for predicting driver fatigue

    The proposed approach, based on physiological characteristics of sitting behaviour and sophisticated machine learning techniques, would enable an effective and practical solution for driver fatigue prognosis, since it is insensitive to the illumination of the driving environment, non-obtrusive to the driver, does not violate the driver's privacy, and is more acceptable to drivers.
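    The abstract does not specify the sensors or the models, so any code can only be illustrative. A minimal sketch, assuming per-window sitting-behaviour features (e.g., posture-shift and centre-of-pressure statistics) and a generic classifier on synthetic stand-in data:

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 6))   # stand-in per-window sitting features
        y = rng.integers(0, 2, 200)     # stand-in labels: 0 = alert, 1 = fatigued

        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        # On real labelled recordings this would estimate prognosis accuracy;
        # on random data it stays at chance level by construction.
        print(cross_val_score(clf, X, y, cv=5).mean())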

    A video-based technique for heart rate and eye blinks rate estimation: A potential solution for telemonitoring and remote healthcare

    Current telemedicine and remote healthcare applications foresee different interactions between the doctor and the patient relying on the use of commercial and medical wearable sensors and internet-based video conferencing platforms. Nevertheless, the existing applications necessarily require contact between the patient and the sensors for an objective evaluation of the patient's state. The proposed study explored an innovative video-based solution for monitoring neurophysiological parameters of potential patients and assessing their mental state. In particular, we investigated the possibility of estimating the heart rate (HR) and eye blinks rate (EBR) of participants while performing laboratory tasks by means of facial video analysis. The objectives of the study were focused on: (i) assessing the effectiveness of the proposed technique in estimating the HR and EBR by comparing them with laboratory sensor-based measures, and (ii) assessing the capability of the video-based technique to discriminate between the participants' resting state (Nominal condition) and their active state (Non-nominal condition). The results demonstrated that the HR and EBR estimated through the facial video technique and through the laboratory equipment did not statistically differ (p > 0.1), and that these neurophysiological parameters allowed discriminating between the Nominal and Non-nominal states (p < 0.02).
    Ronca, V.; Giorgi, A.; Rossi, D.; Di Florio, A.; Di Flumeri, G.; AricĂČ, P.; Sciaraffa, N.; Vozzi, A.; Tamborra, L.; Simonetti, I.; Borghini, G.
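    As an illustration of what such facial-video estimation can look like, the following minimal sketch (not the authors' pipeline) recovers a heart rate from the mean green channel of a face region, a classic remote-photoplethysmography cue, and a blink rate from peaks in a per-frame eye-closure score; face detection and eye-landmark extraction are assumed to happen elsewhere.

        import numpy as np
        from scipy.signal import butter, filtfilt, find_peaks

        def heart_rate_bpm(green_trace, fps):
            """green_trace: mean green value of the face ROI in each frame."""
            # Band-pass to the plausible HR range (0.7-3.0 Hz ~ 42-180 bpm).
            b, a = butter(3, [0.7 / (fps / 2), 3.0 / (fps / 2)], btype="band")
            pulse = filtfilt(b, a, green_trace - green_trace.mean())
            freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
            return 60.0 * freqs[np.abs(np.fft.rfft(pulse)).argmax()]

        def blink_rate_per_min(eye_closure, fps, thresh=0.5):
            """eye_closure: per-frame closure score in [0, 1] (1 = closed)."""
            # Count closure peaks at least 0.1 s apart as blinks.
            peaks, _ = find_peaks(eye_closure, height=thresh,
                                  distance=int(0.1 * fps))
            return 60.0 * fps * len(peaks) / len(eye_closure)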
    • 
