6 research outputs found

    Crowdsourced data collection of facial responses

    In the past, collecting data to train facial expression and affect recognition systems has been time-consuming and has often led to data that do not include spontaneous expressions. We present the first crowdsourced data collection of dynamic, natural and spontaneous facial responses as viewers watch media online. This system allowed a corpus of 3,268 videos to be collected in under two months. We characterize the data in terms of viewer demographics, position, scale, pose and movement of the viewer within the frame, and illumination of the facial region. We compare statistics from this corpus to those from the CK+ and MMI databases and show that distributions of position, scale, pose, movement and luminance of the facial region are significantly different from those represented in these datasets. We demonstrate that it is possible to efficiently collect massive amounts of ecologically valid responses, to known stimuli, from a diverse population using such a system. In addition, facial feature points within the videos can be tracked for over 90% of the frames. These responses were collected without the need for scheduling, payment or recruitment. Finally, we describe a subset of the data (over 290 videos) that will be available to the research community.
    Funding: Things That Think Consortium; Procter & Gamble Company.
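
    As a hedged sketch of the kind of distributional comparison described above, the snippet below contrasts a per-video facial-region statistic (here, mean luminance) between an in-the-wild corpus and a lab-recorded corpus using a two-sample Kolmogorov-Smirnov test; the data, the choice of test, and all variable names are illustrative assumptions, not the paper's actual analysis.

        # Illustrative sketch (assumed data and test choice, not the paper's pipeline):
        # compare the distribution of a facial-region statistic between an in-the-wild
        # corpus and a lab-recorded corpus such as CK+ or MMI.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)

        # Placeholder measurements: mean luminance of the facial region per video.
        webcam_luminance = rng.normal(loc=110.0, scale=35.0, size=3268)  # crowdsourced corpus
        lab_luminance = rng.normal(loc=140.0, scale=10.0, size=593)      # lab-recorded corpus

        statistic, p_value = stats.ks_2samp(webcam_luminance, lab_luminance)
        print(f"KS statistic = {statistic:.3f}, p = {p_value:.2e}")
        if p_value < 0.01:
            print("Luminance distributions differ significantly at the 1% level.")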

    Affectiva-MIT Facial Expression Dataset (AM-FED): Naturalistic and Spontaneous Facial Expressions Collected In-the-Wild

    Computer classification of facial expressions requires large amounts of data, and this data needs to reflect the diversity of conditions seen in real applications. Public datasets help accelerate the progress of research by providing researchers with a benchmark resource. We present a comprehensively labeled dataset of ecologically valid spontaneous facial responses recorded in natural settings over the Internet. To collect the data, online viewers watched one of three intentionally amusing Super Bowl commercials and were simultaneously filmed using their webcams. They answered three self-report questions about their experience. A subset of viewers additionally gave consent for their data to be shared publicly with other researchers. This subset consists of 242 facial videos (168,359 frames) recorded in real-world conditions. The dataset is comprehensively labeled for the following: 1) frame-by-frame labels for the presence of 10 symmetrical FACS action units, 4 asymmetric (unilateral) FACS action units, 2 head movements, smile, general expressiveness, feature tracker failures and gender; 2) the location of 22 automatically detected landmark points; 3) self-report responses of familiarity with, liking of, and desire to watch again for the stimulus videos; and 4) baseline performance of detection algorithms on this dataset. This data is available for distribution to researchers online; the EULA can be found at: http://www.affectiva.com/facial-expression-dataset-am-fed/
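
    To illustrate how such frame-by-frame labels might be scored against a detector, here is a minimal sketch that computes a frame-level ROC AUC for one action unit; the column names, toy values, and use of scikit-learn are assumptions, not the dataset's actual file layout or baseline method.

        # Minimal sketch (assumed column names and toy data, not AM-FED's actual layout):
        # score a detector's per-frame confidence against frame-by-frame AU labels.
        import pandas as pd
        from sklearn.metrics import roc_auc_score

        frames = pd.DataFrame({
            "frame":       [0, 1, 2, 3, 4, 5],
            "AU12":        [0, 0, 1, 1, 1, 0],               # labeled presence of the AU
            "smile_score": [0.1, 0.3, 0.7, 0.9, 0.6, 0.2],   # hypothetical detector output
        })

        auc = roc_auc_score(frames["AU12"], frames["smile_score"])
        print(f"Frame-level AUC for AU12: {auc:.2f}")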

    Predicting Online Media Effectiveness Based on Smile Responses Gathered Over the Internet

    We present an automated method for classifying “liking” and “desire to view again” based on over 1,500 facial responses to media collected over the Internet. This is a very challenging pattern recognition problem that involves robust detection of smile intensities in uncontrolled settings and classification of naturalistic and spontaneous temporal data with large individual differences. We examine the manifold of responses and analyze the false positives and false negatives that result from classification. The results demonstrate the possibility of an ecologically valid, unobtrusive evaluation of commercial “liking” and “desire to view again”, strong predictors of marketing success, based only on facial responses. The area under the curve for the best “liking” and “desire to view again” classifiers was 0.8 and 0.78, respectively, under a challenging leave-one-commercial-out testing regime. The technique could be employed in personalizing video ads presented to people while they view programming over the Internet, or in copy testing of ads to unobtrusively quantify effectiveness.
    Funding: MIT Media Lab Consortium.
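
    A minimal sketch of the leave-one-commercial-out regime follows, assuming simple summary features over each smile-intensity track and an SVM classifier; the features, classifier, and toy data are illustrative assumptions, not the paper's actual pipeline.

        # Illustrative sketch (assumed features, classifier and data): hold out all
        # responses to one commercial per fold and score "liking" prediction by AUC.
        import numpy as np
        from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        rng = np.random.default_rng(1)

        def summarize(smile_track):
            """Compact per-response features: mean, max, and gradient energy."""
            return [smile_track.mean(), smile_track.max(), np.abs(np.diff(smile_track)).sum()]

        # Toy data: 60 responses to 3 commercials, each a smile-intensity track in [0, 1].
        tracks = [rng.random(300) for _ in range(60)]
        X = np.array([summarize(t) for t in tracks])
        y = rng.integers(0, 2, size=60)          # self-reported "liking" (binary)
        commercial = np.repeat([0, 1, 2], 20)    # which ad each viewer watched

        clf = make_pipeline(StandardScaler(), SVC(probability=True))
        aucs = cross_val_score(clf, X, y, groups=commercial,
                               cv=LeaveOneGroupOut(), scoring="roc_auc")
        print("Leave-one-commercial-out AUC per fold:", np.round(aucs, 2))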

    Can the Crowd Tell How I Feel? Trait Empathy and Ethnic Background in a Visual Pain Judgment Task.

    Many advocate for artificial agents to be empathic. Crowdsourcing could help, by facilitating human-in-the-loop approaches and dataset creation for visual emotion recognition algorithms. Although crowdsourcing has been employed successfully for a range of tasks, it is not clear how effective crowdsourcing is when the task involves subjective rating of emotions. We examined relationships between demographics, empathy and ethnic identity in pain emotion recognition tasks. Amazon MTurkers viewed images of strangers in painful settings and tagged the subjects’ emotions. They rated their level of pain arousal and confidence in their responses, and completed tests to gauge trait empathy and ethnic identity. We found that Caucasian participants were less confident than others, even when viewing other Caucasians in pain. Gender correlated with word choices for describing images, though not with pain arousal or confidence. The results underscore the need for verified information on crowdworkers in order to harness diversity effectively for metadata generation tasks.
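
    A hedged sketch of the kind of group comparison reported here (e.g., whether rating confidence differs by annotator background) is shown below; the toy data, column names, and the use of a Welch t-test are illustrative assumptions, not the study's actual analysis.

        # Illustrative sketch (assumed data and test): compare self-rated confidence
        # between annotator groups in a crowdsourced emotion-tagging task.
        import pandas as pd
        from scipy import stats

        ratings = pd.DataFrame({
            "ethnicity":  ["Caucasian", "Caucasian", "Other", "Other", "Caucasian", "Other"],
            "confidence": [3.1, 2.8, 4.2, 3.9, 2.5, 4.0],   # self-rated confidence (1-5)
        })

        caucasian = ratings.loc[ratings["ethnicity"] == "Caucasian", "confidence"]
        other = ratings.loc[ratings["ethnicity"] == "Other", "confidence"]
        t_stat, p_value = stats.ttest_ind(caucasian, other, equal_var=False)
        print(f"Welch t = {t_stat:.2f}, p = {p_value:.3f}")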

    What Your Face Vlogs About: Expressions of Emotion and Big-Five Traits Impressions in YouTube

    Social video sites where people share their opinions and feelings are increasing in popularity. The face is known to reveal important aspects of human psychological traits, so understanding how facial expressions relate to personal constructs is a relevant problem in social media. We present a study of the connections between automatically extracted facial expressions of emotion and impressions of Big-Five personality traits in YouTube vlogs (i.e., video blogs). We use the Computer Expression Recognition Toolbox (CERT) system to characterize users of conversational vlogs. From CERT temporal signals corresponding to instantaneously recognized facial expression categories, we propose and derive four sets of behavioral cues that characterize face statistics and dynamics in a compact way. The cue sets are first used in a correlation analysis to assess the relevance of each facial expression of emotion with respect to Big-Five impressions obtained from crowd-observers watching the vlogs, and also as features for automatic personality impression prediction. Using a dataset of 281 vloggers, the study shows that while multiple facial expression cues have significant correlations with several of the Big-Five traits, they are only able to significantly predict Extraversion impressions, with moderate R-squared values.
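
    The sketch below illustrates, under stated assumptions, how per-frame expression signals could be reduced to compact cues and correlated with an impression score; the cue definitions, toy signals, and variable names are hypothetical and do not reproduce CERT's outputs or the paper's exact cue sets.

        # Illustrative sketch (assumed cues and toy data): reduce a per-frame "joy"
        # signal to compact statistics and correlate one cue with Extraversion ratings.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)

        def activity_cues(signal):
            """Face statistics and dynamics: mean level, variability, rate of change."""
            return {
                "mean": signal.mean(),
                "std": signal.std(),
                "mean_abs_diff": np.abs(np.diff(signal)).mean(),
            }

        # Toy data: 281 vloggers, each with a per-frame joy signal and an impression score.
        joy_signals = [rng.random(1000) for _ in range(281)]
        extraversion = rng.normal(4.5, 1.0, size=281)    # crowd-observer impression (1-7)

        joy_mean = np.array([activity_cues(s)["mean"] for s in joy_signals])
        r, p = stats.pearsonr(joy_mean, extraversion)
        print(f"Pearson r = {r:.2f} (p = {p:.2f}) between mean joy and Extraversion")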

    Augmenting the performance of image similarity search through crowdsourcing

    Crowdsourcing is defined as “outsourcing a task that is traditionally performed by an employee to a large group of people in the form of an open call” (Howe 2006). Many platforms have been designed to support several types of crowdsourcing, and studies have shown that results produced by crowds on these platforms are generally accurate and reliable. Crowdsourcing can provide a fast and efficient way to use the power of human computation to solve problems that are difficult for machines to perform. From the several microtasking crowdsourcing platforms available, we decided to perform our study using Amazon Mechanical Turk. We first studied the effect of user interface design, and its corresponding cognitive load, on the performance of crowd-produced results. Our results highlighted the importance of a well-designed user interface for crowdsourcing performance. Using crowdsourcing platforms such as Amazon Mechanical Turk, we can recruit humans to solve problems that are difficult for computers, such as image similarity search. However, for tasks like image similarity search, it is more efficient to design a hybrid human–machine system. We therefore studied the effect of involving the crowd on the performance of an image similarity search system and propose a hybrid human–machine image similarity search system. Our proposed system uses machine power to perform heavy computations and to search for similar images within the image dataset, and uses crowdsourcing to refine the results. We designed our content-based image retrieval (CBIR) system using SIFT, SURF, SURF128 and ORB feature detectors/descriptors and compared the performance of the system with each. Our experiments confirmed that crowdsourcing can dramatically improve CBIR system performance.
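
    As an illustrative sketch of the machine stage of such a hybrid system, the snippet below ranks candidate images by ORB descriptor matches against a query before handing the top results to the crowd for refinement; the file paths, match threshold, and use of OpenCV's brute-force matcher are assumptions, not the thesis's exact configuration.

        # Illustrative sketch (assumed paths and threshold): rank candidates by ORB
        # feature matches, then pass the best candidates to crowdworkers for refinement.
        import cv2

        def orb_similarity(query_path, candidate_path, max_features=500):
            """Number of close descriptor matches between two images (higher = more similar)."""
            orb = cv2.ORB_create(nfeatures=max_features)
            img1 = cv2.imread(query_path, cv2.IMREAD_GRAYSCALE)
            img2 = cv2.imread(candidate_path, cv2.IMREAD_GRAYSCALE)
            if img1 is None or img2 is None:
                return 0                                   # unreadable or missing image
            _, desc1 = orb.detectAndCompute(img1, None)
            _, desc2 = orb.detectAndCompute(img2, None)
            if desc1 is None or desc2 is None:
                return 0
            matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
            matches = matcher.match(desc1, desc2)
            return sum(1 for m in matches if m.distance < 40)   # placeholder threshold

        # Rank candidates by similarity, then send the top few to the crowd.
        candidates = ["img_001.jpg", "img_002.jpg", "img_003.jpg"]
        ranked = sorted(candidates, key=lambda p: orb_similarity("query.jpg", p), reverse=True)
        print("Top machine candidates for crowd verification:", ranked[:2])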