6 research outputs found

    Assessing the emotional impact of video using machine learning techniques

    Get PDF
Typically, when a human being watches a video, different sensations and mental states can be stimulated. Among these, the sensation of fear can be triggered by watching segments of movies containing themes such as violence, horror, and suspense. Both audio and visual stimuli may contribute to inducing fear in the viewer. This dissertation studies the use of machine learning for forecasting the emotional effects triggered by video, more precisely, the automatic identification of fear-inducing video segments. Using the LIRIS-ACCEDE dataset, several experiments were performed to identify the feature sets most relevant to the problem and to assess the performance of different machine learning classifiers. Both classical and deep learning techniques were implemented and evaluated using the Scikit-learn and TensorFlow machine learning libraries. Two approaches to training and testing were followed: film-level dataset splitting, where different films were used for training and testing; and sample-level dataset splitting, which allowed samples from the same films to appear in both the training and test sets. The prediction of movie segments that trigger fear achieved an F1-score of 18.5% with the first approach, a value suggesting that the dataset does not adequately represent the universe of movies, and an F1-score of about 84.0% with the second approach, a substantially higher value that shows promising outcomes for the proposed task.
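    As an illustration of the two splitting strategies described above, the sketch below contrasts a film-level split (here approximated with scikit-learn's GroupShuffleSplit, using film identifiers as groups) with a plain sample-level split. The feature matrix, labels, and film identifiers are placeholder assumptions, not the dissertation's actual LIRIS-ACCEDE pipeline.

```python
# Illustrative sketch (not the dissertation's code): contrasting film-level and
# sample-level train/test splits for fear-segment classification.
# X (features), y (fear labels) and film_ids are hypothetical stand-ins for a
# LIRIS-ACCEDE preprocessing step.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GroupShuffleSplit, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 32))            # placeholder audio-visual features
y = rng.integers(0, 2, size=1000)          # placeholder fear / no-fear labels
film_ids = rng.integers(0, 20, size=1000)  # which film each segment comes from

# Film-level split: no film appears in both the training and test sets.
film_split = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(film_split.split(X, y, groups=film_ids))

# Sample-level split: segments from the same film may land in both sets.
tr2, te2 = train_test_split(np.arange(len(X)), test_size=0.2, random_state=0)

for name, (tr, te) in {"film-level": (train_idx, test_idx),
                       "sample-level": (tr2, te2)}.items():
    clf = RandomForestClassifier(random_state=0).fit(X[tr], y[tr])
    print(name, "F1:", f1_score(y[te], clf.predict(X[te])))
```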

    Learning Emotion Representations from Verbal and Nonverbal Communication

    Full text link
Emotion understanding is an essential but highly challenging component of artificial general intelligence. The absence of extensively annotated datasets has significantly impeded advancements in this field. We present EmotionCLIP, the first pre-training paradigm to extract visual emotion representations from verbal and nonverbal communication using only uncurated data. Compared to the numerical labels or descriptions used in previous methods, communication naturally contains emotion information. Furthermore, acquiring emotion representations from communication is more congruent with the human learning process. We guide EmotionCLIP to attend to nonverbal emotion cues through subject-aware context encoding and to verbal emotion cues using sentiment-guided contrastive learning. Extensive experiments validate the effectiveness and transferability of EmotionCLIP. Using merely a linear-probe evaluation protocol, EmotionCLIP outperforms state-of-the-art supervised visual emotion recognition methods and rivals many multimodal approaches across various benchmarks. We anticipate that the advent of EmotionCLIP will address the prevailing issue of data scarcity in emotion understanding, thereby fostering progress in related domains. The code and pre-trained models are available at https://github.com/Xeaver/EmotionCLIP.
    Comment: CVPR 2023
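    The abstract mentions a linear-probe evaluation protocol; the hedged sketch below shows the general idea of training a single linear classifier on top of frozen embeddings. The extract_embeddings helper and the label setup are hypothetical stand-ins, not part of the released EmotionCLIP code.

```python
# Minimal sketch of a linear-probe protocol: one linear classifier is trained
# on frozen embeddings; the encoder itself is never fine-tuned.
# extract_embeddings is a placeholder for a frozen visual encoder.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def extract_embeddings(videos):
    # Placeholder: in practice this would run the frozen encoder over the clips.
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(videos), 512))

train_videos, test_videos = list(range(800)), list(range(200))
y_train = np.random.default_rng(1).integers(0, 8, size=800)  # 8 assumed emotion classes
y_test = np.random.default_rng(2).integers(0, 8, size=200)

probe = LogisticRegression(max_iter=1000)      # the linear probe itself
probe.fit(extract_embeddings(train_videos), y_train)
preds = probe.predict(extract_embeddings(test_videos))
print("linear-probe accuracy:", accuracy_score(y_test, preds))
```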

    MMPosE: Movie-induced multi-label positive emotion classification through EEG signals

    Get PDF
Emotional information plays an important role in various multimedia applications. Movies, as a widely available form of multimedia content, can induce multiple positive emotions and stimulate people's pursuit of a better life. Unlike negative emotions, positive emotions are highly correlated and difficult to distinguish in the emotional space. Since different positive emotions are often induced simultaneously by movies, traditional single-target or multi-class methods are not suitable for the classification of movie-induced positive emotions. In this paper, we propose TransEEG, a model for multi-label positive emotion classification from a viewer's brain activity when watching emotional movies. The key features of TransEEG include (1) explicitly modeling the spatial correlations and temporal dependencies of multi-channel EEG signals using a Transformer-based model, which effectively addresses long-distance dependencies, (2) exploiting label-label correlations to guide discriminative EEG representation learning, for which we design an Inter-Emotion Mask that guides the Multi-Head Attention to learn inter-emotion correlations, and (3) constructing an attention score vector from the representation-label correlation matrix to refine emotion-relevant EEG features. To evaluate the ability of our model for multi-label positive emotion classification, we demonstrate it on the state-of-the-art positive emotion database CPED. Extensive experimental results show that our proposed method achieves superior performance over competitive approaches.
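    As a rough illustration of the kind of architecture described (not the authors' TransEEG/MMPosE implementation), the sketch below applies multi-head self-attention across EEG channels followed by a sigmoid head for multi-label prediction. Channel count, layer sizes, and the number of emotion labels are assumptions; the Inter-Emotion Mask and label-correlation refinement are not reproduced here.

```python
# Hedged sketch: multi-head self-attention over multi-channel EEG features with
# a sigmoid output for multi-label positive-emotion classification.
import tensorflow as tf

n_channels, n_timesteps, n_emotions = 62, 128, 4   # assumed EEG layout / label count

inputs = tf.keras.Input(shape=(n_channels, n_timesteps))
x = tf.keras.layers.Dense(64)(inputs)                      # per-channel embedding
attn = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=16)
x = tf.keras.layers.LayerNormalization()(x + attn(x, x))   # spatial self-attention
x = tf.keras.layers.GlobalAveragePooling1D()(x)
outputs = tf.keras.layers.Dense(n_emotions, activation="sigmoid")(x)  # multi-label head

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(multi_label=True, num_labels=n_emotions)])
model.summary()
```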