132,702 research outputs found

    Oops! Predicting Unintentional Action in Video

    From just a short glance at a video, we can often tell whether a person's action is intentional or not. Can we train a model to recognize this? We introduce a dataset of in-the-wild videos of unintentional action, as well as a suite of tasks for recognizing, localizing, and anticipating its onset. We train a supervised neural network as a baseline and analyze its performance compared to human consistency on the tasks. We also investigate self-supervised representations that leverage natural signals in our dataset, and show the effectiveness of an approach that uses the intrinsic speed of video to perform competitively with highly-supervised pretraining. However, a significant gap between machine and human performance remains. The project website is available at https://oops.cs.columbia.edu
    Comment: 11 pages, 9 figures
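    The abstract's self-supervised signal is the intrinsic speed of video. A minimal sketch of that kind of playback-speed pretext task is below; the backbone, shapes, and function names are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch of a playback-speed pretext task: the label is the
# frame-sampling stride itself, so no human annotation is needed.
import torch
import torch.nn as nn
import torch.nn.functional as F

SPEEDS = [1, 2, 4, 8]  # sampling strides standing in for playback speeds

def sample_clip(frames, speed_idx, clip_len=16):
    """frames: (T, C, H, W); subsample every stride-th frame, keep clip_len."""
    return frames[::SPEEDS[speed_idx]][:clip_len]

class SpeedClassifier(nn.Module):
    def __init__(self, backbone, feat_dim, n_speeds=len(SPEEDS)):
        super().__init__()
        self.backbone = backbone              # any video encoder
        self.head = nn.Linear(feat_dim, n_speeds)

    def forward(self, clips):                 # clips: (B, C, T, H, W)
        return self.head(self.backbone(clips))

# Tiny stand-in encoder so the sketch runs end to end.
backbone = nn.Sequential(
    nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool3d(1), nn.Flatten())
model = SpeedClassifier(backbone, feat_dim=16)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def pretext_step(frames_batch):
    """frames_batch: list of (T, C, H, W) tensors with T >= 16 * max stride."""
    speed_idx = torch.randint(len(SPEEDS), (1,)).item()
    clips = torch.stack([sample_clip(f, speed_idx).permute(1, 0, 2, 3)
                         for f in frames_batch])          # (B, C, T, H, W)
    labels = torch.full((clips.size(0),), speed_idx, dtype=torch.long)
    loss = F.cross_entropy(model(clips), labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```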

    Leveraging TCN and Transformer for effective visual-audio fusion in continuous emotion recognition

    Human emotion recognition plays an important role in human-computer interaction. In this paper, we present our approach to the Valence-Arousal (VA) Estimation Challenge, Expression (Expr) Classification Challenge, and Action Unit (AU) Detection Challenge of the 5th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). Specifically, we propose a novel multi-modal fusion model that leverages Temporal Convolutional Networks (TCN) and Transformers to enhance the performance of continuous emotion recognition. Our model aims to effectively integrate visual and audio information for improved accuracy in recognizing emotions. Our model outperforms the baseline and ranks 3rd in the Expression Classification challenge.
    Comment: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
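    The abstract describes per-modality temporal modeling with TCNs followed by Transformer-based fusion. Below is a hedged sketch of one plausible wiring of that pipeline; the feature dimensions, layer counts, and the simple concatenation fusion are assumptions, not the authors' architecture.

```python
# Hedged sketch of a TCN + Transformer visual-audio fusion model.
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    """Dilated temporal convolution with a residual connection."""
    def __init__(self, dim, dilation):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size=3,
                              padding=dilation, dilation=dilation)
        self.relu = nn.ReLU()

    def forward(self, x):                     # x: (B, dim, T)
        return self.relu(x + self.conv(x))    # padding keeps T unchanged

class FusionModel(nn.Module):
    def __init__(self, d_vis=512, d_aud=128, d_model=256, n_layers=2):
        super().__init__()
        self.vis_proj = nn.Conv1d(d_vis, d_model, 1)
        self.aud_proj = nn.Conv1d(d_aud, d_model, 1)
        self.vis_tcn = nn.Sequential(*[TCNBlock(d_model, 2 ** i) for i in range(3)])
        self.aud_tcn = nn.Sequential(*[TCNBlock(d_model, 2 ** i) for i in range(3)])
        layer = nn.TransformerEncoderLayer(d_model=2 * d_model, nhead=8,
                                           batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(2 * d_model, 2)   # frame-wise valence, arousal

    def forward(self, vis, aud):   # vis: (B, T, d_vis), aud: (B, T, d_aud)
        v = self.vis_tcn(self.vis_proj(vis.transpose(1, 2)))
        a = self.aud_tcn(self.aud_proj(aud.transpose(1, 2)))
        fused = torch.cat([v, a], dim=1).transpose(1, 2)    # (B, T, 2*d_model)
        return self.head(self.fusion(fused))                # (B, T, 2)

model = FusionModel()
vis, aud = torch.randn(4, 100, 512), torch.randn(4, 100, 128)
print(model(vis, aud).shape)   # torch.Size([4, 100, 2])
```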

    Wild Emptiness: A Zen Approach to Environmental Ethics

    When Buddhism took root in China and integrated with the nation’s Taoist intellectual climate, the tradition retained the orthodox central objective of overcoming suffering. While conserving this principal aspiration, the rise of Zen is associated with a deviation from the orthodox practice of monasticism toward the practical embodiment of emptiness while integrated in society, which can be likened to the practice of unwavering compassion. This piece offers a Zen approach to environmental ethics, attempting to explicate how and why an individual practicing Zen should compassionately engage with the earth. With respect to the Buddhist employment of skillful means, different approaches are offered as tailored ethical frameworks to appeal to individuals at different stages on their path to awakening. Furthermore, the environmental ethic appealed to by the awakened individual is explicated as spontaneous harmonization with the flow of emptiness, which Buddhists regard as the driving force of the phenomenal realm. The awakened individual is considered to non-deliberately take on a lifestyle that provides perpetual intimacy with the rhythms of wild emptiness.

    Facial Expression Recognition from World Wild Web

    Recognizing facial expressions in a wild setting remains a challenging task in computer vision. The World Wide Web is a good source of facial images, most of which are captured in uncontrolled conditions. In fact, the Internet is a World Wild Web of facial images with expressions. This paper presents the results of a new study on collecting, annotating, and analyzing wild facial expressions from the web. Three search engines were queried using 1250 emotion-related keywords in six different languages, and the retrieved images were mapped by two annotators to six basic expressions and neutral. Deep neural networks and noise modeling were used in three different training scenarios to find out how accurately facial expressions can be recognized when trained on noisy images collected from the web using query terms (e.g. happy face, laughing man, etc.). The results of our experiments show that deep neural networks can recognize wild facial expressions with an accuracy of 82.12%.
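    The abstract mentions noise modeling for training on weakly labeled web images. One standard formulation of that idea is a learned label-noise transition matrix stacked on the classifier; the sketch below illustrates it under assumed input shapes and is not the paper's exact method.

```python
# Hedged sketch of label-noise modeling for web-collected expression images:
# a learned transition matrix maps the classifier's "clean" distribution to
# the noisy web-label distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_CLASSES = 7   # six basic expressions + neutral

class NoiseAdaptation(nn.Module):
    """p(noisy j | x) = sum_i p(clean i | x) * T[i, j], with T learned."""
    def __init__(self, n_classes=N_CLASSES):
        super().__init__()
        # Initialize near the identity: assume most web labels are correct.
        self.logits = nn.Parameter(torch.eye(n_classes) * 5.0)

    def forward(self, clean_probs):                  # (B, C)
        T = torch.softmax(self.logits, dim=1)        # rows sum to 1
        return clean_probs @ T

base = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64 * 3, N_CLASSES))  # stand-in net
noise = NoiseAdaptation()
optimizer = torch.optim.Adam(list(base.parameters()) + list(noise.parameters()))

def train_step(images, web_labels):      # images: (B, 3, 64, 64)
    clean = torch.softmax(base(images), dim=1)
    noisy = noise(clean)                 # fit the noisy web labels in training...
    loss = F.nll_loss(torch.log(noisy + 1e-8), web_labels)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()                   # ...but predict with `clean` at test time
```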

    AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

    This paper introduces a video dataset of spatio-temporally localized Atomic Visual Actions (AVA). The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently. The key characteristics of our dataset are: (1) the definition of atomic visual actions, rather than composite actions; (2) precise spatio-temporal annotations with possibly multiple annotations for each person; (3) exhaustive annotation of these atomic actions over 15-minute video clips; (4) people temporally linked across consecutive segments; and (5) using movies to gather a varied set of action representations. This departs from existing datasets for spatio-temporal action recognition, which typically provide sparse annotations for composite actions in short video clips. We will release the dataset publicly. AVA, with its realistic scene and action complexity, exposes the intrinsic difficulty of action recognition. To benchmark this, we present a novel approach for action localization that builds upon the current state-of-the-art methods, and demonstrates better performance on JHMDB and UCF101-24 categories. While setting a new state of the art on existing datasets, the overall results on AVA are low at 15.6% mAP, underscoring the need for developing new approaches for video understanding.
    Comment: To appear in CVPR 2018. Check the dataset page https://research.google.com/ava/ for details
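    For concreteness, here is a small sketch of loading AVA-style annotations, which are distributed as CSV rows of video id, keyframe timestamp, normalized box corners, action id, and person id; consult the dataset page for the authoritative schema.

```python
# Sketch of loading AVA-style annotations. The released CSVs use rows of
# video_id, middle_frame_timestamp, x1, y1, x2, y2, action_id, person_id,
# with box corners normalized to [0, 1] -- verify against the dataset page.
import csv
from collections import defaultdict

def load_ava(path):
    keyframes = defaultdict(list)   # (video_id, timestamp) -> annotations
    with open(path, newline="") as f:
        for video_id, ts, x1, y1, x2, y2, action_id, person_id in csv.reader(f):
            keyframes[(video_id, float(ts))].append({
                "box": (float(x1), float(y1), float(x2), float(y2)),
                "action": int(action_id),   # one of the 80 atomic actions
                "person": int(person_id),   # links a person across segments
            })
    return keyframes
```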