19 research outputs found

    A low-cost cognitive assistant

    Get PDF
    In this paper, we present in depth the hardware components of a low-cost cognitive assistant. The aim is to assess the performance and the emotional state of elderly people while they perform exercises. Physical and cognitive exercises are a proven way of keeping elderly people active, healthy, and happy. Our goal is to bring an assistant to people who are at home (or in unsupervised settings) that motivates them to perform exercises and, concurrently, monitors them, observing their physical and emotional responses. We focus on the hardware parts and the deep learning models so that they can be reproduced by others. The platform is being tested at an elderly people care facility, and validation is in progress. This work was partly supported by the FCT (Fundação para a Ciência e a Tecnologia) through the Post-Doc scholarship SFRH/BPD/102696/2014 (A. Costa), by the Generalitat Valenciana (PROMETEO/2018/002), and by the Spanish Government (RTI2018-095390-B-C31).
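    The abstract does not reproduce the assistant's software, so the following Python sketch is only a rough illustration of how a camera-based loop could periodically detect a face and log a predicted emotion during an exercise session. The Haar-cascade detector, one-second sampling interval, and the `classify_emotion` stub are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical monitoring loop for a low-cost, camera-based assistant.
# The face detector is OpenCV's bundled Haar cascade; the emotion classifier
# below is a stand-in for the paper's deep learning models.
import time
import cv2

face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def classify_emotion(face_img):
    """Placeholder for a trained emotion model (assumption, not the paper's)."""
    return "neutral"

cap = cv2.VideoCapture(0)            # low-cost USB camera
log = []
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in faces:
            emotion = classify_emotion(frame[y:y + h, x:x + w])
            log.append((time.time(), emotion))   # aggregated later per session
        time.sleep(1.0)                          # sample roughly once per second
finally:
    cap.release()
```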

    Multi-Branch Network for Imagery Emotion Prediction

    Full text link
    For a long time, images have proven effective at both storing and conveying rich semantics, especially human emotions. A great deal of research has been conducted to provide machines with the ability to recognize emotions in photos of people. Previous methods mostly focus on facial expressions but fail to consider the scene context, even though scene context plays an important role in predicting emotions and leads to more accurate results. In addition, Valence-Arousal-Dominance (VAD) values offer a more precise quantitative understanding of continuous emotions, yet there has been less emphasis on predicting them compared to discrete emotional categories. In this paper, we present a novel Multi-Branch Network (MBN) that utilizes various sources of information, including faces, bodies, and scene contexts, to predict both discrete and continuous emotions in an image. Experimental results on the EMOTIC dataset, which contains large-scale images of people in unconstrained situations labeled with 26 discrete categories of emotions and VAD values, show that our proposed method significantly outperforms state-of-the-art methods, with 28.4% in mAP and 0.93 in MAE. The results highlight the importance of utilizing multiple contextual cues in emotion prediction and illustrate the potential of our proposed method in a wide range of applications, such as affective computing, human-computer interaction, and social robotics. Source code: https://github.com/BaoNinh2808/Multi-Branch-Network-for-Imagery-Emotion-Prediction. Comment: SOICT 202
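    As a sketch of the multi-branch idea described above (not the authors' released code, which is linked in the abstract), the following PyTorch snippet wires separate backbones for face, body, and scene-context crops and predicts both 26 discrete emotion scores and 3 continuous VAD values, EMOTIC-style. The ResNet-18 backbones and concatenation fusion are assumptions made for illustration.

```python
# Illustrative multi-branch emotion network: per-input backbones, concatenated
# features, and two heads (discrete categories and continuous VAD values).
import torch
import torch.nn as nn
from torchvision import models

class MultiBranchEmotionNet(nn.Module):
    def __init__(self, n_discrete=26, n_continuous=3):
        super().__init__()
        def backbone():
            m = models.resnet18(weights=None)
            m.fc = nn.Identity()                  # expose 512-d features
            return m
        self.face_branch = backbone()
        self.body_branch = backbone()
        self.context_branch = backbone()
        self.discrete_head = nn.Linear(512 * 3, n_discrete)      # one logit per category
        self.continuous_head = nn.Linear(512 * 3, n_continuous)  # valence/arousal/dominance

    def forward(self, face, body, context):
        feats = torch.cat([self.face_branch(face),
                           self.body_branch(body),
                           self.context_branch(context)], dim=1)
        return self.discrete_head(feats), self.continuous_head(feats)

# Example forward pass with dummy crops.
net = MultiBranchEmotionNet()
face = body = context = torch.randn(2, 3, 224, 224)
cat_logits, vad = net(face, body, context)   # shapes: (2, 26) and (2, 3)
```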

    A programmable baby monitor as an element of the Internet of Things

    Get PDF
    The aim of the work was to create a mobile application cooperating with a Raspberry Pi. It is intended to facilitate childcare by providing a camera preview and sending notifications in case of noise in the child's room. The mobile application was built in the Android Studio environment. The system is based on wireless Wi-Fi communication, which allows access to data from the device anywhere within the range of the local network. Data from the device and the user database are stored on an Apache server in a MySQL database. A temperature and humidity sensor is used, which makes it possible to monitor conditions in the child's room and to develop and integrate the system into the world of the Internet of Things (IoT).
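    A minimal sketch of the Raspberry Pi side of such a system is shown below: read a temperature/humidity sensor and report the values to a backend so the mobile app can display them. The DHT22 sensor, GPIO pin, and `/api/readings` endpoint address are assumptions for illustration; the paper stores data via an Apache server and MySQL behind some such interface.

```python
# Hypothetical Raspberry Pi reporting loop for room conditions.
import time
import requests
import Adafruit_DHT   # legacy Adafruit DHT sensor library

SENSOR = Adafruit_DHT.DHT22
PIN = 4                                        # GPIO pin wired to the sensor (assumption)
SERVER = "http://192.168.1.10/api/readings"    # example local Apache endpoint (assumption)

while True:
    humidity, temperature = Adafruit_DHT.read_retry(SENSOR, PIN)
    if humidity is not None and temperature is not None:
        requests.post(SERVER, json={
            "temperature_c": round(temperature, 1),
            "humidity_pct": round(humidity, 1),
            "timestamp": time.time(),
        }, timeout=5)
    time.sleep(60)   # one reading per minute is enough for room monitoring
```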

    The Bydgoszcz hand exoskeleton: concept and preliminary results

    Get PDF
    The ability to grasp and manipulate various objects constitutes a basic functional skill that allows for further development toward the use of tools, handwriting, and other activities of daily living. This paper focuses on the concept of a hand exoskeleton for adult patients, as well as preliminary findings in the area of improving the parameters of a hand with a deficit compared to the parameters of the healthy hand. This enables not only immediate functional recovery but also shapes that recovery during subsequent phases of rehabilitation.

    A Diaspora of Humans to Technology: VEDA Net for Sentiments and their Technical Analysis

    Get PDF
    Background: Human sentiments are a representation of one's soul. Visual media has emerged as one of the most potent instruments for communicating thoughts and feelings in today's world. The area of visual emotion analysis is abstract due to the considerable amount of bias in the human cognitive process. Machines need to apprehend and segment these better for future AI advancements. A broad range of prior research has investigated only the emotion-class identifier part of the whole process. In this work, we focus on proposing a better architecture to assess an emotion identifier and on finding a better strategy to extract and process an input image for that architecture. Objective: We investigate the subject of visual emotion detection and analysis using a connected dense-blocked network and propose the architecture VEDANet. We show that the proposed architecture performs extremely effectively across different datasets. Method: Using CNN-based pre-trained architectures, we highlight the spatial hierarchies of visual features. Because the image's spatial regions communicate substantial feelings, we utilize the dense-block-based model VEDANet, which focuses on the image's relevant sentiment-rich regions for effective emotion extraction. This work makes a substantial addition by providing an in-depth investigation of the proposed architecture, carrying out extensive trials on popular benchmark datasets to assess accuracy gains over comparable state-of-the-art methods. In terms of emotion detection, the outcomes of the study show that the proposed VED system outperforms existing ones in accuracy. Further, we explore over-the-top optimization, i.e., an OTO layer, to achieve higher efficiency. Results: When compared to recent research, the proposed model performs admirably and obtains an accuracy of 87.30% on the AffectNet dataset, 92.76% on Google FEC, 95.23% on the Yale dataset, and 97.63% on the FER2013 dataset. We successfully merged the model with a face detector to obtain 98.34% accuracy on real-time live frames, further encouraging real-time applications. In comparison to existing approaches, we achieve real-time performance with a minimal TAT (turn-around time) trade-off by using an appropriate network size and fewer parameters.
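    As a rough illustration of a dense-block-based emotion classifier in the spirit of VEDANet (the paper's exact architecture and OTO layer are not reproduced here), the snippet below builds on torchvision's DenseNet-121 backbone and replaces its classification head. The backbone choice, head size, and seven-class output are assumptions.

```python
# Illustrative dense-block-based emotion classifier (assumed configuration).
import torch
import torch.nn as nn
from torchvision import models

def build_emotion_classifier(n_classes=7):
    net = models.densenet121(weights=None)        # densely connected blocks
    in_features = net.classifier.in_features      # 1024 for DenseNet-121
    net.classifier = nn.Sequential(               # emotion-specific head
        nn.Linear(in_features, 256),
        nn.ReLU(inplace=True),
        nn.Dropout(0.3),
        nn.Linear(256, n_classes),
    )
    return net

model = build_emotion_classifier()
logits = model(torch.randn(1, 3, 224, 224))       # (1, 7) emotion logits
```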

    Hierarchical Delta-Attention Method for Multimodal Fusion

    Full text link
    In vision and linguistics, the main input modalities are facial expressions, speech patterns, and the words uttered. The issue with analyzing any one mode of expression (visual, verbal, or vocal) is that a lot of contextual information can get lost. This requires researchers to inspect multiple modalities to get a thorough understanding of the cross-modal dependencies and the temporal context of the situation when analyzing the expression. This work attempts to preserve the long-range dependencies within and across different modalities, which would otherwise be bottlenecked by the use of recurrent networks, and adds the concept of delta-attention to focus on local differences per modality in order to capture the idiosyncrasies of different people. We explore a cross-attention fusion technique to obtain the global view of the emotion expressed through these delta-self-attended modalities, fusing all the local nuances and the global context together. The addition of attention is new to the multimodal fusion field, and it is still under scrutiny at which stage the attention mechanism should be applied; this work achieves competitive accuracy for overall and per-class classification, close to the current state of the art with almost half the number of parameters.
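    The delta-attention idea described above can be sketched as follows: per modality, attend over frame-to-frame differences (deltas) to emphasize local changes, then fuse modalities with cross-attention. The feature dimensions, single-head attention, and mean pooling below are illustrative assumptions, not the paper's exact configuration.

```python
# Rough sketch of delta self-attention per modality plus cross-attention fusion.
import torch
import torch.nn as nn

class DeltaSelfAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, x):                        # x: (batch, time, dim)
        delta = x[:, 1:] - x[:, :-1]             # local frame-to-frame differences
        out, _ = self.attn(delta, delta, delta)  # self-attention over the deltas
        return out

class CrossModalFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.cross = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, query_mod, context_mod):
        fused, _ = self.cross(query_mod, context_mod, context_mod)
        return fused.mean(dim=1)                 # pooled multimodal representation

visual = torch.randn(2, 20, 64)                  # e.g. 20 video frames, 64-d features
audio = torch.randn(2, 20, 64)
v = DeltaSelfAttention(64)(visual)
a = DeltaSelfAttention(64)(audio)
emotion_repr = CrossModalFusion(64)(v, a)        # (2, 64), fed to a classifier
```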

    Visual and Lingual Emotion Recognition using Deep Learning Techniques

    Get PDF
    Emotion recognition has been an integral part of many applications, such as video games, cognitive computing, and human-computer interaction. Emotion can be recognized from many sources, including speech, facial expressions, hand gestures, and textual attributes. We have developed a prototype emotion recognition system using computer vision and natural language processing techniques. Our hybrid system uses mobile camera frames and features extracted from speech, namely Mel-Frequency Cepstral Coefficients (MFCCs), to recognize the emotion of a person. To recognize emotions based on facial expressions, we have developed a Convolutional Neural Network (CNN) model, which has an accuracy of 68%. To recognize emotions based on speech MFCCs, we have developed a sequential model with an accuracy of 69%. Our Android application can access the front and back cameras simultaneously. This allows the application to predict the emotion of the overall conversation happening between the people facing both cameras. The application is also able to record the audio conversation between those people. The two predicted emotions (face and speech) are merged into a single emotion using the fusion algorithm. Our models are converted to TensorFlow Lite models to reduce the model size and accommodate the limited processing power of mobile devices. Our system classifies emotions into seven classes: neutral, surprise, happy, fear, sad, disgust, and angry.
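    Two steps mentioned above can be sketched briefly: a simple late-fusion rule merging face and speech emotion probabilities, and conversion of a trained Keras model to TensorFlow Lite for the mobile app. The 0.6/0.4 weights and the class ordering are assumptions; the paper's exact fusion algorithm may differ.

```python
# Hedged sketch: late fusion of two probability vectors and TFLite export.
import numpy as np
import tensorflow as tf

CLASSES = ["neutral", "surprise", "happy", "fear", "sad", "disgust", "angry"]

def fuse_emotions(face_probs, speech_probs, w_face=0.6, w_speech=0.4):
    """Late fusion: weighted average of the per-class probability vectors."""
    fused = w_face * np.asarray(face_probs) + w_speech * np.asarray(speech_probs)
    return CLASSES[int(np.argmax(fused))]

def export_tflite(keras_model, path="emotion_model.tflite"):
    """Shrink a trained Keras model for on-device inference."""
    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]   # weight quantization
    with open(path, "wb") as f:
        f.write(converter.convert())
```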