
    Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

    Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices, so automatic ad affect recognition has several useful applications. However, content-based feature representations give no insight into how affect is modulated by aspects such as the ad scene setting, salient object attributes and their interactions. Nor do such approaches inform us about how humans prioritize visual information for ad understanding. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics and actively attended objects identified via eye gaze. We measure the importance of each of these information channels by systematically incorporating the related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure encode affective information better than individual scene objects or conspicuous background elements.

    Comment: Accepted for publication in the Proceedings of the 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US
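
    The channel-importance methodology described above (scoring each information channel by how much affective information it carries) can be sketched as an ablation loop that evaluates each channel in isolation. The channel names, synthetic features, noise levels and nearest-centroid classifier below are illustrative assumptions, not the authors' actual pipeline.

```python
import random

random.seed(0)

# Hypothetical information channels; one scalar feature each for simplicity.
CHANNELS = ["scene_structure", "object_stats", "attended_objects"]

def make_sample(label):
    """Synthetic sample: each channel carries the affect label plus
    channel-specific noise (noise levels are assumptions)."""
    noise = {"scene_structure": 0.5, "object_stats": 2.0, "attended_objects": 0.3}
    return {ch: label + random.gauss(0.0, noise[ch]) for ch in CHANNELS}

def accuracy(train, test, channels):
    """Nearest-centroid classifier restricted to a subset of channels."""
    centroids = {}
    for lbl in (0, 1):
        rows = [s for s, y in train if y == lbl]
        centroids[lbl] = {ch: sum(r[ch] for r in rows) / len(rows) for ch in channels}
    correct = 0
    for s, y in test:
        pred = min((0, 1), key=lambda l: sum((s[ch] - centroids[l][ch]) ** 2
                                             for ch in channels))
        correct += pred == y
    return correct / len(test)

data = [(make_sample(y), y) for y in (0, 1) for _ in range(200)]
random.shuffle(data)
train, test = data[:300], data[300:]

# Ablation: score each channel alone to rank its affective information.
scores = {ch: accuracy(train, test, [ch]) for ch in CHANNELS}
```

    With these (assumed) noise levels, the low-noise "attended objects" channel scores highest, mirroring the paper's ranking of channels by predictive value.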

    The eyes know it: FakeET -- An Eye-tracking Database to Understand Deepfake Perception

    We present FakeET, an eye-tracking database to understand human visual perception of deepfake videos. Given that the principal purpose of deepfakes is to deceive human observers, FakeET is designed to understand and evaluate the ease with which viewers can detect synthetic video artifacts. FakeET contains viewing patterns compiled from 40 users via the Tobii desktop eye-tracker for 811 videos from the Google Deepfake dataset, with a minimum of two viewings per video. EEG responses acquired via the Emotiv sensor are also available. The compiled data confirm (a) distinct eye-movement characteristics for real vs. fake videos; (b) the utility of eye-track saliency maps for spatial forgery localization and detection; and (c) Error Related Negativity (ERN) triggers in the EEG responses, and the ability of the raw EEG signal to distinguish between real and fake videos.

    Comment: 8 pages
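
    A fixation-based saliency map of the kind used here for spatial forgery localization can be approximated by accumulating a Gaussian blob at each recorded gaze point and normalising the result. The grid size, bandwidth and gaze coordinates below are illustrative assumptions; FakeET's actual maps are derived from Tobii recordings.

```python
import math

def saliency_map(fixations, width, height, sigma=5.0):
    """Accumulate a 2-D Gaussian centred on each (x, y) fixation,
    then normalise the map so its peak is 1.0."""
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                grid[y][x] += math.exp(-d2 / (2 * sigma ** 2))
    peak = max(max(row) for row in grid)
    return [[v / peak for v in row] for row in grid]

# Hypothetical fixations: two clustered on a suspected manipulated region,
# one elsewhere; the cluster produces the hotter spot.
smap = saliency_map([(10, 10), (11, 9), (30, 30)], width=40, height=40)
```

    Regions where fixations cluster receive proportionally higher saliency, which is what makes such maps usable as weak localization cues for forged regions.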

    An EEG-Based Image Annotation System

    The success of deep learning in computer vision has greatly increased the need for annotated image datasets. Although humans can recognize objects in 20–200 ms, the need to manually label images results in a low annotation throughput. We propose an EEG (electroencephalogram)-based image annotation system that employs brain signals captured via a consumer EEG device to achieve an annotation rate of up to 10 images per second. We exploit the P300 event-related potential (ERP) signature to identify target images during a rapid serial visual presentation (RSVP) task, and further perform unsupervised outlier removal to achieve an F1-score of 0.88 on the test set. The proposed system does not depend on category-specific EEG signatures, enabling the annotation of any new image category without model pre-training.
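
    The P300-based target detection described above can be sketched as thresholding the mean amplitude of each post-stimulus epoch in a window around 300 ms. The sampling rate, window, threshold and synthetic epochs below are illustrative assumptions; the actual system uses a trained classifier plus unsupervised outlier removal rather than a fixed threshold.

```python
import random

random.seed(1)
FS = 250                                  # assumed sampling rate (Hz)
WIN = (int(0.25 * FS), int(0.50 * FS))    # 250-500 ms post-stimulus window

def make_epoch(has_p300):
    """Synthetic 1-s epoch: Gaussian noise, plus a positive deflection
    around 300 ms for target images (toy stand-in for a real P300)."""
    epoch = [random.gauss(0.0, 1.0) for _ in range(FS)]
    if has_p300:
        for i in range(*WIN):
            epoch[i] += 3.0
    return epoch

def is_target(epoch, threshold=1.0):
    """Flag an epoch as 'target image' if the window mean exceeds threshold."""
    window = epoch[WIN[0]:WIN[1]]
    return sum(window) / len(window) > threshold

epochs = [(make_epoch(t), t) for t in [True, False] * 50]
acc = sum(is_target(e) == t for e, t in epochs) / len(epochs)
```

    Because the RSVP paradigm time-locks each image onset, even this crude window-mean statistic separates target from non-target epochs on the synthetic data.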

    EEG-based biometrics: Effects of template ageing

    This chapter discusses the effects of template ageing in EEG-based biometrics. It also serves as an introduction to general biometrics and its two main tasks: identification and verification. We investigate different characterisations of EEG signals and examine the difference in subject-identification performance between single-session and cross-session experiments. To do so, EEG signals are characterised with common state-of-the-art features, i.e. Mel-Frequency Cepstral Coefficients (MFCC), autoregression coefficients and Power Spectral Density-derived features. The samples are then classified using various classifiers, including Support Vector Machines and k-Nearest Neighbours with different parametrisations. Results show that performance tends to be worse for cross-session identification than for single-session identification. This finding suggests that the temporal permanence of EEG signals is limited, and thus more sophisticated methods are needed to characterise EEG signals for the task of subject identification.
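
    Subject identification of the kind evaluated above reduces to extracting a feature vector per EEG sample and matching it against enrolled templates. The sketch below uses toy segment-energy features (a crude stand-in for the chapter's PSD-derived features) and a 1-nearest-neighbour matcher on synthetic signals; the subject-specific amplitude profiles and all numbers are illustrative assumptions.

```python
import math
import random

random.seed(2)

def features(signal, n_bands=4):
    """Toy feature vector: log mean energy in equal-length segments
    (stand-in for PSD-derived features)."""
    seg = len(signal) // n_bands
    return [math.log(sum(s * s for s in signal[i * seg:(i + 1) * seg]) / seg)
            for i in range(n_bands)]

def make_signal(subject, n=400):
    """Synthetic per-subject EEG: a subject-specific amplitude profile
    modulating Gaussian noise (an assumption for illustration)."""
    amps = [1.0 + 0.5 * ((subject * b) % 3) for b in range(4)]
    seg = n // 4
    return [amps[i // seg] * random.gauss(0.0, 1.0) for i in range(n)]

def identify(probe, templates):
    """1-NN matcher: return the enrolled subject whose template is closest."""
    f = features(probe)
    return min(templates, key=lambda s: sum((a - b) ** 2
                                            for a, b in zip(f, templates[s])))

subjects = [1, 2, 3]
templates = {s: features(make_signal(s)) for s in subjects}           # enrolment
probes = [(s, make_signal(s)) for s in subjects for _ in range(10)]   # same session
acc = sum(identify(x, templates) == s for s, x in probes) / len(probes)
```

    Template ageing enters exactly here: if the probes came from a later session with drifted signal statistics, the distances to the enrolment templates would grow and this same-session accuracy would degrade, as the chapter's cross-session results show.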

    Collaborative Brain-Computer Interfaces in Rapid Image Presentation and Motion Pictures

    The last few years have seen an increase in brain-computer interface (BCI) research for the able-bodied population. One of these new branches involves collaborative BCIs (cBCIs), in which information from several users is combined to improve the performance of a BCI system. This thesis focuses on cBCIs, with the aim of increasing our understanding of how they can be used to improve the performance of single-user BCIs based on event-related potentials (ERPs). The objectives are: (1) to study and compare different methods of creating groups using exclusively electroencephalography (EEG) signals, (2) to develop a theoretical model establishing where the highest gains may be expected from creating groups, and (3) to analyse the information that can be extracted by merging signals from multiple users. Two scenarios involving real-world stimuli (images presented at high rates, and movies) were studied. The first scenario consisted of a visual search task in which images were presented at high frequencies. Three modes of combining EEG recordings from different users were tested to improve the detection of different ERPs, namely the P300 (associated with the presence of events of interest) and the N2pc (associated with shifts of attention). We showed that the detection and localisation of targets improve significantly when information from multiple viewers is combined. In the second scenario, feature movies were used to study, through cBCI techniques, variations in ERPs in response to cuts. A distinct, previously unreported ERP appears in relation to such cuts, whose amplitude is not modulated by visual effects such as the low-level properties of the frames surrounding the discontinuity; however, significant movie-dependent variations were found. We hypothesise that these techniques can be used to build on the attentional theory of cinematic continuity by providing an extra source of information: the brain.
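
    The core cBCI idea above, merging evidence from several viewers to detect an ERP more reliably than any single viewer, can be sketched as averaging per-user classifier scores before thresholding. The score model (signed mean plus independent per-user noise) and all numbers are illustrative assumptions, not one of the thesis's three actual combination modes.

```python
import random

random.seed(3)

def user_score(is_event, noise=1.5):
    """Hypothetical single-user classifier score: mean +1 when the ERP-evoking
    event is present, -1 otherwise, plus independent noise."""
    return (1.0 if is_event else -1.0) + random.gauss(0.0, noise)

def detect(scores):
    """Group decision: average scores across users and threshold at zero."""
    return sum(scores) / len(scores) > 0.0

def accuracy(n_users, n_trials=2000):
    correct = 0
    for t in range(n_trials):
        is_event = t % 2 == 0
        scores = [user_score(is_event) for _ in range(n_users)]
        correct += detect(scores) == is_event
    return correct / n_trials

solo, group = accuracy(1), accuracy(8)
```

    Averaging over n users shrinks the noise on the pooled score by roughly sqrt(n) while leaving the event-related mean untouched, which is why the group decision outperforms the single-user one under this independence assumption.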