Automatic Understanding of Image and Video Advertisements
There is more to images than their objective physical content: for example,
advertisements are created to persuade a viewer to take a certain action. We
propose the novel problem of automatic advertisement understanding. To enable
research on this problem, we create two datasets: an image dataset of 64,832
image ads, and a video dataset of 3,477 ads. Our data contains rich annotations
encompassing the topic and sentiment of the ads, questions and answers
describing what actions the viewer is prompted to take and the reasoning that
the ad presents to persuade the viewer ("What should I do according to this ad,
and why should I do it?"), and symbolic references ads make (e.g. a dove
symbolizes peace). We also analyze the most common persuasive strategies ads
use, and the capabilities that computer vision systems should have to
understand these strategies. We present baseline classification results for
several prediction tasks, including automatically answering questions about the
messages of the ads. Comment: To appear in CVPR 2017; data available on
http://cs.pitt.edu/~kovashka/ad
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework that highlights the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing
the assumptions made and thus the constraints imposed on the type of video
that each technique is able to address. Making these assumptions and
constraints explicit makes the framework particularly useful for selecting a
method for a given application. Another advantage of the proposed organization
is that it allows the newest approaches to be categorized seamlessly alongside
traditional ones, while providing an insightful perspective on the evolution of
the action recognition task up to now. That perspective is the basis for the
discussion at the end of the paper, where we also present the main open issues
in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2
figures, 4 tables
Informedia at TRECVID 2003: Analyzing and searching broadcast news video
We submitted a number of semantic classifiers, most of which were trained only on keyframes. We also experimented with classifiers that were trained exclusively on text data and relative time within the video, while a few were trained using all available modalities.
1.2 Interactive search
This year, we submitted two runs using different versions of the Informedia system. In one run, a version identical to last year's interactive system was used by five researchers, who split up the topics between themselves. The system interface emphasizes text queries, allowing search across ASR, closed captions and OCR text. The result set can then be manipulated through:
• storyboards of images spanning across video story segments
• emphasizing matching shots to a user's query to reduce the image count to a manageable size
• resolution and layout under user control
• additional filtering provided through shot classifiers such as outdoors, shots with people, etc.
• display of filter count and distribution to guide their use in manipulating storyboard views.
In the best-performing interactive run, for all topics a single researcher used an improved version of the system, which allowed more effective browsing and visualization of the results of text queries using
10371 Abstracts Collection -- Dynamic Maps
From September 12th to 17th, 2010, the Dagstuhl Seminar 10371 "Dynamic Maps" was held in Schloss Dagstuhl - Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Findings of Factify 2: Multimodal Fake News Detection
With social media usage growing exponentially in the past few years, fake
news has also become extremely prevalent. The detrimental impact of fake news
emphasizes the need for research focused on automating the detection of false
information and verifying its accuracy. In this work, we present the outcome of
the Factify 2 shared task, which provides a multi-modal fact verification and
satire news dataset, as part of the DeFactify 2 workshop at AAAI'23. The data
calls for a comparison-based approach to the task by pairing social media
claims with supporting documents, each with both text and image, divided into 5
classes based on multi-modal relations. In the second iteration of this task we
had over 60 participants and 9 final test-set submissions. The best
performances came from the use of DeBERTa for text and Swinv2 and CLIP for
image. The highest F1 score averaged for all five classes was 81.82%. Comment: Defactify2 @AAAI 2023
Recent Developments in Video Surveillance
With surveillance cameras installed everywhere and continuously streaming thousands of hours of video, how can that huge amount of data be analyzed or even be useful? Is it possible to search those countless hours of videos for subjects or events of interest? Shouldn't the presence of a car stopped at a railroad crossing trigger an alarm system to prevent a potential accident? In the chapters selected for this book, experts in video surveillance provide answers to these questions and other interesting problems, skillfully blending research experience with practical real-life applications. Academic researchers will find a reliable compilation of relevant literature in addition to pointers to current advances in the field. Industry practitioners will find useful hints about state-of-the-art applications. The book also provides directions for open problems where further advances can be pursued.