A low-cost cognitive assistant
In this paper, we present in depth the hardware components of a low-cost cognitive assistant. The aim is to detect the performance and the emotional state that elderly people present when performing exercises. Physical and cognitive exercises are a proven way of keeping elderly people active, healthy, and happy. Our goal is to bring to people who are at home (or in unsupervised places) an assistant that motivates them to perform exercises and, concurrently, monitors them, observing their physical and emotional responses. We focus on the hardware parts and the deep learning models so that they can be reproduced by others. The platform is being tested at an elderly care facility, and validation is in progress. This work was partly supported by the FCT (Fundação para a Ciência e a Tecnologia) through the Post-Doc scholarship SFRH/BPD/102696/2014 (A. Costa), by the Generalitat Valenciana (PROMETEO/2018/002), and by the Spanish Government (RTI2018-095390-B-C31)
Multi-Branch Network for Imagery Emotion Prediction
For a long time, images have proved perfect at both storing and conveying
rich semantics, especially human emotions. A lot of research has been conducted
to provide machines with the ability to recognize emotions in photos of people.
Previous methods mostly focus on facial expressions but fail to consider the
scene context, even though scene context plays an important role in emotion
prediction and can lead to more accurate results. In addition,
Valence-Arousal-Dominance (VAD) values offer a more precise quantitative
understanding of continuous emotions, yet there has been less emphasis on
predicting them compared to discrete emotional categories. In this paper, we
present a novel Multi-Branch Network (MBN), which utilizes various source
information, including faces, bodies, and scene contexts to predict both
discrete and continuous emotions in an image. Experimental results on EMOTIC
dataset, which contains large-scale images of people in unconstrained
situations labeled with 26 discrete categories of emotions and VAD values, show
that our proposed method significantly outperforms state-of-the-art methods,
with 28.4% mAP and 0.93 MAE. The results highlight the importance of
utilizing multiple sources of contextual information in emotion prediction and
illustrate the potential of our proposed method in a wide range of
applications, such as affective computing, human-computer interaction, and
social robotics. Source code:
https://github.com/BaoNinh2808/Multi-Branch-Network-for-Imagery-Emotion-Prediction
Comment: SOICT 202
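The branch-and-fuse design described above can be sketched minimally as follows; the embedding size, the random placeholder weights, and the plain concatenation fusion are illustrative assumptions rather than the paper's implementation (the linked repository contains the real one):

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_branch_heads(face, body, scene, n_cats=26):
    """Toy multi-branch head: concatenate the face, body, and scene
    embeddings, then predict 26 discrete emotion scores (sigmoid,
    multi-label) and 3 continuous VAD values (linear regression head).
    Random weights stand in for trained layers."""
    fused = np.concatenate([face, body, scene])           # (3 * d,)
    W_cat = rng.standard_normal((n_cats, fused.size)) * 0.01
    W_vad = rng.standard_normal((3, fused.size)) * 0.01
    cat_scores = 1.0 / (1.0 + np.exp(-(W_cat @ fused)))   # each in (0, 1)
    vad = W_vad @ fused                                   # valence, arousal, dominance
    return cat_scores, vad

# Hypothetical 128-d embeddings from three backbone branches.
face = rng.standard_normal(128)
body = rng.standard_normal(128)
scene = rng.standard_normal(128)
cats, vad = multi_branch_heads(face, body, scene)
```

Because EMOTIC labels are multi-label, the discrete head uses independent sigmoids rather than a softmax.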
A Programmable Baby Monitor as an Element of the Internet of Things
The aim of the work was to create a mobile application cooperating with a Raspberry Pi. It is supposed to facilitate childcare by providing a camera view and sending notifications in case of noise in the child's room. The mobile application was made in the Android Studio environment. The system is based on wireless Wi-Fi communication, which allows access to data from the device throughout the area covered by the local network. Data from the device and the user database are stored on an Apache server in a MySQL database. A temperature and humidity sensor has been used, which makes it possible to monitor the conditions in the child's room and to develop and integrate the system into the world of the Internet of Things (IoT).
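A minimal sketch of the noise-trigger logic described above, assuming the system compares each audio frame's RMS level against a fixed threshold; the threshold value and frame handling are illustrative, not taken from the thesis:

```python
import numpy as np

def rms_level(samples):
    """Root-mean-square level of one audio frame."""
    samples = np.asarray(samples, dtype=float)
    return float(np.sqrt(np.mean(samples ** 2)))

def noise_alert(samples, threshold=0.2):
    """True when the frame is loud enough that the app should push
    a 'noise in the child's room' notification."""
    return rms_level(samples) > threshold

# Synthetic frames: a quiet background hum vs. a loud (crying-level) signal.
t = np.linspace(0, 20, 1600)
quiet = 0.01 * np.sin(t)
loud = 0.8 * np.sin(t)
```

In the real system, an alert like this would be recorded via the Apache/MySQL backend so the Android application can display the notification.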
The Bydgoszcz Hand Exoskeleton - Concept and Preliminary Results
The ability to grasp and manipulate various objects constitutes a basic functional skill that allows further development toward the use of tools, handwriting, and other activities of daily living. This paper focuses on the concept of a hand exoskeleton for adult patients, as well as preliminary findings in the area of improving the parameters of a hand with a deficit compared to the parameters of a healthy hand. This yields not only immediate functional recovery but also shapes that recovery during subsequent phases of rehabilitation.
A Diaspora of Humans to Technology: VEDA Net for Sentiments and their Technical Analysis
Background: Human sentiments are the representation of one's soul. Visual media has emerged as one of the most potent instruments for communicating thoughts and feelings in today's world. The area of visual emotion analysis is abstract due to the considerable amount of bias in the human cognitive process. Machines need to apprehend and segment these better for future AI advancements. A broad range of prior research has investigated only the emotion-class-identifier part of the whole process. In this work, we focus on proposing a better architecture to assess an emotion identifier and on finding a better strategy to extract and process an input image for that architecture.
Objective: We investigate the subject of visual emotion detection and analysis using a connected dense-blocked network to propose an architecture, VEDANet. We show that the proposed architecture performs highly effectively across different datasets.
Method: Using CNN-based pre-trained architectures, we highlight the spatial hierarchies of visual features. Because an image's spatial regions communicate substantial feelings, we utilize a dense-block-based model, VEDANet, that focuses on the image's relevant sentiment-rich regions for effective emotion extraction. This work makes a substantial contribution by providing an in-depth investigation of the proposed architecture, carrying out extensive trials on popular benchmark datasets to assess accuracy gains over the comparable state-of-the-art. In terms of emotion detection, the outcomes of the study show that the proposed VED system outperforms the existing ones in accuracy. Further, we explore Over-the-Top Optimization, i.e., an OTO layer, to achieve higher efficiency.
Results: When compared to recent research, the proposed model performs admirably, obtaining an accuracy of 87.30% on the AffectNet dataset, 92.76% on Google FEC, 95.23% on the Yale dataset, and 97.63% on the FER2013 dataset. We successfully merged the model with a face detector to obtain 98.34% accuracy on real-time live frames, further encouraging real-time applications. In comparison to existing approaches, we achieve real-time performance with a minimal TAT (turn-around time) trade-off by using an appropriate network size and fewer parameters.
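The dense connectivity that VEDANet builds on can be illustrated with a toy block; real dense blocks use small convolutions with normalization, which are replaced here by random 1x1 projections purely to show how concatenation grows the feature channels:

```python
import numpy as np

def dense_block(x, n_layers=4, growth=12, seed=0):
    """DenseNet-style block: each layer produces `growth` new feature
    maps, and its input is the concatenation of every earlier output.
    Random 1x1 projections + ReLU stand in for trained convolutions."""
    rng = np.random.default_rng(seed)
    feats = x                                  # (channels, H, W)
    for _ in range(n_layers):
        c = feats.shape[0]
        W = rng.standard_normal((growth, c)) * 0.01
        new = np.maximum(np.einsum('oc,chw->ohw', W, feats), 0.0)  # ReLU
        feats = np.concatenate([feats, new], axis=0)  # dense connectivity
    return feats

x = np.random.default_rng(1).standard_normal((16, 8, 8))
out = dense_block(x)   # channels grow to 16 + 4 * 12 = 64
```

The growing concatenation is what lets later layers attend to sentiment-rich features computed anywhere earlier in the block.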
Hierarchical Delta-Attention Method for Multimodal Fusion
In vision and linguistics, the main input modalities are facial expressions,
speech patterns, and the words uttered. The issue with analyzing any one mode
of expression (visual, verbal, or vocal) is that a lot of contextual
information can get lost. This pushes researchers to inspect multiple
modalities to gain a thorough understanding of the cross-modal dependencies
and the temporal context of the situation when analyzing the expression. This
work attempts to preserve the long-range dependencies within and across
different modalities, which would be bottlenecked by the use of recurrent
networks, and adds the concept of delta-attention to focus on local
differences per modality so as to capture the idiosyncrasies of different
people. We explore a cross-attention fusion technique to obtain a global view
of the emotion expressed through these delta-self-attended modalities, fusing
all the local nuances and the global context together. The use of attention is
new to the multimodal fusion field, and it is still an open question at what
stage the attention mechanism should be applied; this work achieves
competitive overall and per-class classification accuracy, close to the
current state of the art, with almost half the number of parameters
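The two mechanisms described above can be sketched in a toy single-head form; the dimensions and the absence of learned query/key/value projections are simplifications, not the paper's architecture:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def delta_self_attention(x):
    """Self-attention over frame-to-frame deltas of one modality.
    x: (T, d) sequence; the deltas emphasize local changes, i.e. how
    a person's expression moves rather than its absolute value."""
    delta = np.diff(x, axis=0, prepend=x[:1])        # x[t] - x[t-1], (T, d)
    scores = delta @ delta.T / np.sqrt(x.shape[1])   # scaled dot products
    return softmax(scores) @ delta                   # (T, d)

def cross_attention(q_mod, kv_mod):
    """One modality queries another to pull in cross-modal context."""
    scores = q_mod @ kv_mod.T / np.sqrt(q_mod.shape[1])
    return softmax(scores) @ kv_mod

visual = np.random.default_rng(0).standard_normal((10, 32))
vocal = np.random.default_rng(1).standard_normal((10, 32))
fused = cross_attention(delta_self_attention(visual),
                        delta_self_attention(vocal))
```

Unlike a recurrent network, every frame here attends to every other frame directly, so long-range dependencies are not squeezed through a sequential bottleneck.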
Visual and Lingual Emotion Recognition using Deep Learning Techniques
Emotion recognition has been an integral part of many applications like video games, cognitive computing, and human-computer interaction. Emotion can be recognized from many sources, including speech, facial expressions, hand gestures, and textual attributes. We have developed a prototype emotion recognition system using computer vision and natural language processing techniques. Our hybrid system uses mobile camera frames and features extracted from speech, namely Mel-Frequency Cepstral Coefficients (MFCCs), to recognize the emotion of a person. To recognize emotions based on facial expressions, we have developed a Convolutional Neural Network (CNN) model, which has an accuracy of 68%. To recognize emotions based on speech MFCCs, we have developed a sequential model with an accuracy of 69%. Our Android application can access the front and back cameras simultaneously. This allows our application to predict the emotion of the overall conversation happening between the people facing the two cameras. The application is also able to record the audio conversation between those people. The two predicted emotions (face and speech) are merged into one single emotion using the fusion algorithm. Our models are converted to TensorFlow Lite models to reduce the model size and accommodate the limited processing power of mobile devices. Our system classifies emotions into seven classes: neutral, surprise, happy, fear, sad, disgust, and angry.
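The abstract does not detail the fusion algorithm; a common late-fusion baseline, shown here purely as an assumption, averages the two class distributions and takes the argmax:

```python
import numpy as np

EMOTIONS = ["neutral", "surprise", "happy", "fear", "sad", "disgust", "angry"]

def fuse_emotions(face_probs, speech_probs, w_face=0.5):
    """Late fusion: weighted average of the face and speech class
    distributions, then argmax for a single conversation-level emotion."""
    face_probs = np.asarray(face_probs, dtype=float)
    speech_probs = np.asarray(speech_probs, dtype=float)
    combined = w_face * face_probs + (1.0 - w_face) * speech_probs
    return EMOTIONS[int(np.argmax(combined))], combined

# Hypothetical per-model outputs over the seven classes.
face = [0.10, 0.05, 0.60, 0.05, 0.10, 0.05, 0.05]
speech = [0.20, 0.05, 0.50, 0.05, 0.10, 0.05, 0.05]
label, combined = fuse_emotions(face, speech)
```

The weight `w_face` could be tuned on validation data to reflect the relative reliability of the two models (68% vs. 69% accuracy here).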