Estimating Carotid Pulse and Breathing Rate from Near-infrared Video of the Neck
Objective: Non-contact physiological measurement is a growing research area
that allows capturing vital signs such as heart rate (HR) and breathing rate
(BR) comfortably and unobtrusively with remote devices. However, most of the
approaches work only in bright environments in which subtle
photoplethysmographic and ballistocardiographic signals can be easily analyzed
and/or require expensive and custom hardware to perform the measurements.
Approach: This work introduces a low-cost method to measure subtle motions
associated with the carotid pulse and breathing movement from the neck using
near-infrared (NIR) video imaging. A skin reflection model of the neck was
established to provide a theoretical foundation for the method. In particular,
the method relies on template matching for neck detection, Principal Component
Analysis for feature extraction, and Hidden Markov Models for data smoothing.
Main Results: We compared the estimated HR and BR measures with ones provided
by an FDA-cleared device in a 12-participant laboratory study: the estimates
achieved a mean absolute error of 0.36 beats per minute and 0.24 breaths per
minute under both bright and dark lighting.
Significance: This work advances the possibilities of non-contact
physiological measurement in real-life conditions in which environmental
illumination is limited and in which the face of the person is not readily
available or needs to be protected. Due to the increasing availability of NIR
imaging devices, the described methods are readily scalable. Comment: 21 pages, 15 figures
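The pipeline described in this abstract (template matching for neck detection, PCA for feature extraction, HMM smoothing) is not published as code here; the following is only a minimal illustrative sketch of the PCA step, recovering a shared pulse-like component from many per-pixel intensity traces. The function name, array shapes, and synthetic data are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def extract_pulse_component(traces):
    """Given per-pixel NIR intensity traces (n_frames x n_pixels),
    return the projection onto the first principal component as the
    candidate pulse signal. Sketch only; the full method also uses
    template-matched neck detection and HMM-based smoothing."""
    X = traces - traces.mean(axis=0)           # center each pixel's trace
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[0]                           # project onto first component

# toy usage: 300 frames at 30 fps, 50 pixels sharing a 1.2 Hz "pulse"
t = np.arange(300) / 30.0
pulse = np.sin(2 * np.pi * 1.2 * t)
rng = np.random.default_rng(0)
traces = np.outer(pulse, rng.uniform(0.5, 1.5, 50)) \
         + 0.1 * rng.standard_normal((300, 50))
sig = extract_pulse_component(traces)
# PC1 sign is arbitrary, so compare by absolute correlation
corr = abs(np.corrcoef(sig, pulse)[0, 1])
```

On clean synthetic data like this, the first principal component recovers the shared pulse almost exactly; real neck video additionally needs the detection and smoothing stages the abstract mentions.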
Contact-Free Heart Rate Measurement From Human Face Videos and its Biometric Recognition Application
An In-Vehicle Vision-Based Driver's Drowsiness Detection System
Many traffic accidents have been reported due to driver drowsiness/fatigue. Drowsiness degrades driving performance due to declines in visibility, situational awareness and decision-making capability. In this study, a vision-based drowsiness detection and warning system is presented, which attempts to alert a driver to his/her own potential drowsiness. The information provided by the system can also be utilized by adaptive systems to manage noncritical operations, such as starting a ventilator, spreading fragrance, turning on a radio, and providing entertainment options. In high-drowsiness situations, the system may initiate navigation aids and alert others to the drowsiness of the driver.
The system estimates the fatigue level of a driver based on his/her facial images acquired by a video camera mounted in the front of the vehicle. There are five major steps involved in the system process: preprocessing, facial feature extraction, face tracking, parameter estimation, and reasoning. In the preprocessing step, the input image is sub-sampled for reducing the image size and in turn the processing time. A lighting compensation process is next applied to the reduced image in order to remove the influences of ambient illumination variations. Afterwards, for each image pixel a number of chrominance values are calculated, which are to be used in the next step for detecting facial features.
There are four sub-steps constituting the feature extraction step: skin detection, face localization, eyes and mouth detection, and feature confirmation. To begin, the skin areas are located in the image based on the chrominance values of pixels calculated in the previous step and a predefined skin model. We next search for the face region within the largest skin area. However, the detected face is typically imperfect, and facial feature detection within an imperfect face region is unreliable. We therefore look for facial features throughout the entire image; the face region is later used to confirm the detected facial features. Once facial features are located, they are tracked over the video sequence until they fail to be detected in a video image, at which point the facial feature detection process is invoked again. Although facial feature detection is time consuming, facial feature tracking is fast and reliable.
During facial feature tracking, parameters of facial expression, including percentage of eye closure over time, eye blinking frequency, durations of eye closure, gaze and mouth opening, as well as head orientation, are estimated. The estimated parameters are then utilized in the reasoning step to determine the driver's drowsiness level. A fuzzy integral technique is employed, which integrates various types of parameter values to arrive at a decision about the drowsiness level of the driver. A number of video sequences of different drivers and illumination conditions have been tested. The results revealed that our system works reasonably well in daytime. In future work, we may extend the system to nighttime operation; for this, infrared sensors should be included.
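Two of the parameters this abstract names, percentage of eye closure over time (commonly called PERCLOS) and blink frequency, are simple to compute once each frame carries an eyes-closed label. The sketch below assumes such a per-frame boolean sequence; the paper derives it from its own eye-detection step, so the function names and window length here are illustrative.

```python
def perclos(eye_closed, fps, window_s=60.0):
    """Fraction of frames in the trailing window during which the
    eyes were closed. `eye_closed` is a per-frame boolean sequence."""
    n = int(window_s * fps)
    window = eye_closed[-n:] if len(eye_closed) > n else eye_closed
    return sum(window) / len(window)

def blink_count(eye_closed):
    """Number of closure episodes (open -> closed transitions)."""
    blinks, prev = 0, False
    for closed in eye_closed:
        if closed and not prev:
            blinks += 1
        prev = closed
    return blinks

# usage: 10 s at 10 fps with two 3-frame blinks
frames = [False]*30 + [True]*3 + [False]*30 + [True]*3 + [False]*34
# perclos(frames, fps=10, window_s=10) -> 6/100 = 0.06
# blink_count(frames) -> 2
```

In the system described above, values like these would feed the fuzzy-integral reasoning step alongside gaze, mouth-opening and head-orientation parameters.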
State of the art of audio- and video based solutions for AAL
Working Group 3. Audio- and Video-based AAL Applications
It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to the high potential they have in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness.
Indeed, cameras and microphones are far less obtrusive with respect to the hindrance other wearable sensors may cause to one's activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large range of sensing, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals' activities and health status can derive from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate settings where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach.
This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users.
The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted.
The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake in real world settings of AAL technologies. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potential coming from the silver economy is overviewed.
Robust Visual Heart Rate Estimation
A novel heart rate estimator, HR-CNN - a two-step convolutional neural network, is presented. The network is trained end-to-end by alternating optimization to be robust to illumination changes and relative movement of the subject and the camera.
The network works well with images of the face roughly aligned by an off-the-shelf commercial frontal face detector. An extensive review of the literature on visual heart rate estimation identifies key factors limiting the performance and reproducibility of the methods as: (i) a lack of publicly available datasets and incomplete description of published experiments, (ii) use of unreliable pulse oximeters for the ground-truth reference, (iii) missing standard experimental protocols. A new challenging publicly available ECG-Fitness dataset with 205 sixty-second videos of subjects performing physical exercises is introduced. The dataset includes 17 subjects performing 4 activities (talking, rowing, exercising on a stepper and a stationary bike) captured by two RGB cameras, one attached to the currently used fitness machine, which significantly vibrates, the other one to a separately standing tripod. For each subject, the "rowing" and "talking" activities are repeated under halogen lamp lighting. For 4 subjects, the whole recording session is additionally lit by an LED light. HR-CNN outperforms the published methods on the dataset, reducing error by more than a half. Each ECG-Fitness activity contains a different combination of realistic challenges. The HR-CNN method performs best on the "rowing" activity with a mean absolute error of 3.94, and worst on the "talking" activity with a mean absolute error of 15.57.
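The mean absolute error figures quoted in this abstract (3.94 and 15.57) are the standard evaluation metric for visual heart rate estimation. For reference, a minimal computation, with hypothetical per-video HR values in place of real dataset outputs:

```python
def mean_absolute_error(estimates, ground_truth):
    """MAE between estimated and reference heart rates (same units)."""
    assert len(estimates) == len(ground_truth) and estimates
    return sum(abs(e, ) if False else abs(e - g)
               for e, g in zip(estimates, ground_truth)) / len(estimates)

# hypothetical per-video estimates vs. ECG-derived references (bpm)
est = [72.0, 81.5, 90.0, 68.0]
ref = [70.0, 80.0, 88.0, 70.0]
# mean_absolute_error(est, ref) -> (2 + 1.5 + 2 + 2) / 4 = 1.875
```

In the dataset evaluation described above, the references come from an ECG device rather than a pulse oximeter, which the authors identify as a more reliable ground truth.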
Models and Analysis of Vocal Emissions for Biomedical Applications
The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread into other areas of research, such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.
Sitting behaviour-based pattern recognition for predicting driver fatigue
The proposed approach, based on physiological characteristics of sitting behaviours and sophisticated machine learning techniques, would enable an effective and practical solution to driver fatigue prognosis, since it is insensitive to the illumination of the driving environment, non-obtrusive to the driver, does not violate the driver's privacy, and is more acceptable to drivers.
A video-based technique for heart rate and eye blinks rate estimation: A potential solution for telemonitoring and remote healthcare
Current telemedicine and remote healthcare applications foresee different interactions between the doctor and the patient relying on the use of commercial and medical wearable sensors and internet-based video conferencing platforms. Nevertheless, the existing applications necessarily require contact between the patient and sensors for an objective evaluation of the patient's state. The proposed study explored an innovative video-based solution for monitoring neurophysiological parameters of potential patients and assessing their mental state. In particular, we investigated the possibility to estimate the heart rate (HR) and eye blinks rate (EBR) of participants while performing laboratory tasks by means of facial-video analysis. The objectives of the study were focused on: (i) assessing the effectiveness of the proposed technique in estimating the HR and EBR by comparing them with laboratory sensor-based measures and (ii) assessing the capability of the video-based technique in discriminating between the participants' resting state (Nominal condition) and their active state (Non-nominal condition). The results demonstrated that the HR and EBR estimated through the facial-video technique or the laboratory equipment did not statistically differ (p > 0.1), and that these neurophysiological parameters made it possible to discriminate between the Nominal and Non-nominal states (p < 0.02). Ronca, V.; Giorgi, A.; Rossi, D.; Di Florio, A.; Di Flumeri, G.; Aricò, P.; Sciaraffa, N.; Vozzi, A.; Tamborra, L.; Simonetti, I.; Borghini, G.