2,547 research outputs found

Dublin City University video track experiments for TREC 2002

Author: Browne Paul
Czirjék Csaba
Gurrin Cathal
Jarina Roman
Lee Hyowon
Marlow Seán
McDonald Kieran
Murphy Noel
O'Connor Noel E.
Smeaton Alan F.
Ye Jiamin
Publication venue: 'University of Aden - Faculty of Economics and Administration'
Publication date: 01/11/2002

Dublin City University participated in the Feature Extraction task and the Search task of the TREC-2002 Video Track. In the Feature Extraction task, we submitted 3 features: Face, Speech, and Music. In the Search task, we developed an interactive video retrieval system, which incorporated the 40 hours of the video search test collection and supported user searching using our own feature extraction data along with the donated feature data and ASR transcripts from other Video Track groups. This video retrieval system allows a user to specify a query based on the 10 features and the ASR transcript, and the query result is a ranked list of videos that can be further browsed at the shot level. To evaluate the usefulness of feature-based querying, we developed a second system interface that provides only ASR transcript-based querying, and we conducted an experiment with 12 test users to compare these 2 systems. Results were submitted to NIST and we are currently conducting further analysis of user performance with these 2 systems.

DCU Online Research Access Service

Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements

Author: Kankanhalli Mohan
Katti Harish
Shukla Abhinav
Subramanian Ramanathan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/08/2018

Emotion evoked by an advertisement plays a key role in influencing brand recall and eventual consumer choices. Automatic ad affect recognition has several useful applications. However, the use of content-based feature representations does not give insights into how affect is modulated by aspects such as the ad scene setting, salient object attributes and their interactions. Neither do such approaches inform us on how humans prioritize visual information for ad understanding. Our work addresses these lacunae by decomposing video content into detected objects, coarse scene structure, object statistics and actively attended objects identified via eye-gaze. We measure the importance of each of these information channels by systematically incorporating related information into ad affect prediction models. Contrary to the popular notion that ad affect hinges on the narrative and the clever use of linguistic and social cues, we find that actively attended objects and the coarse scene structure better encode affective information as compared to individual scene objects or conspicuous background elements.

Comment: Accepted for publication in the Proceedings of 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US

arXiv.org e-Print Archive

University of Canberra Research Repository

Open Access Repository of IISc Research Publications

A framework for realistic 3D tele-immersion

Author: Alexiadis D.
Broeck S.V.
Cesar P.
Daras P.
Eisert P.
Fechteler P.
Hilsmann A.
Kuijk F.
Mauro D.A.
Mekuria R.
Monaghan D.
O'Connor N.E.
Sanna M.
Stevens C.
Wall J.
Zahariadis T.
Publication date: 01/01/2013

Meeting, socializing and conversing online with a group of people using teleconferencing systems is still quite different from the experience of meeting face to face. We are abruptly aware that we are online and that the people we are engaging with are not in close proximity, analogous to how talking on the telephone does not replicate the experience of talking in person. Several causes for these differences have been identified and we propose inspiring and innovative solutions to these hurdles in an attempt to provide a more realistic, believable and engaging online conversational experience. We present the distributed and scalable framework REVERIE that provides a balanced mix of these solutions. Applications built on top of the REVERIE framework will be able to provide interactive, immersive, photo-realistic experiences to a multitude of users that, for them, will feel much more similar to having face-to-face meetings than the experience offered by conventional teleconferencing systems.

UEL Research Repository at University of East London

Crossref

CWI's Institutional Repository

Fraunhofer-ePrints

Irish Universities

DCU Online Research Access Service

Project RISE: Recognizing Industrial Smoke Emissions

Author: Dille Paul
Hoffman Ryan
Hsu Yen-Chia
Hu Ting-Yao
Huang Ting-Hao 'Kenneth'
Nourbakhsh Illah
Pachuta Jessica
Prendi Sean
Sargent Randy
Tsuhlares Anastasia
Publication date: 14/09/2020

Industrial smoke emissions pose a significant concern to human health. Prior works have shown that using Computer Vision (CV) techniques to identify smoke as visual evidence can influence the attitude of regulators and empower citizens to pursue environmental justice. However, existing datasets are not of sufficient quality nor quantity to train the robust CV models needed to support air quality advocacy. We introduce RISE, the first large-scale video dataset for Recognizing Industrial Smoke Emissions. We adopted a citizen science approach to collaborate with local community members to annotate whether a video clip has smoke emissions. Our dataset contains 12,567 clips from 19 distinct views from cameras that monitored three industrial facilities. These daytime clips span 30 days over two years, including all four seasons. We ran experiments using deep neural networks to establish a strong performance baseline and reveal smoke recognition challenges. Our survey study discussed community feedback, and our data analysis displayed opportunities for integrating citizen scientists and crowd workers into the application of Artificial Intelligence for social good.

Comment: Technical repor

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Online Action Detection

Author: De Geest Roeland
Gavves Efstratios
Ghodrati Amir
Li Zhenyang
Snoek Cees
Tuytelaars Tinne
Publication date: 01/01/2016

In online action detection, the goal is to detect the start of an action in a video stream as soon as it happens. For instance, if a child is chasing a ball, an autonomous car should recognize what is going on and respond immediately. This is a very challenging problem for four reasons. First, only partial actions are observed. Second, there is a large variability in negative data. Third, the start of the action is unknown, so it is unclear over what time window the information should be integrated. Finally, in real world data, large within-class variability exists. This problem has been addressed before, but only to some extent. Our contributions to online action detection are threefold. First, we introduce a realistic dataset composed of 27 episodes from 6 popular TV series. The dataset spans over 16 hours of footage annotated with 30 action classes, totaling 6,231 action instances. Second, we analyze and compare various baseline methods, showing this is a challenging problem for which none of the methods provides a good solution. Third, we analyze the change in performance when there is a variation in viewpoint, occlusion, truncation, etc. We introduce an evaluation protocol for fair comparison. The dataset, the baselines and the models will all be made publicly available to encourage (much needed) further research on online action detection on realistic data.

Comment: Project page: http://homes.esat.kuleuven.be/~rdegeest/OnlineActionDetection.htm

arXiv.org e-Print Archive

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications