1,004 research outputs found

    Automatic Understanding of Image and Video Advertisements

    There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action. We propose the novel problem of automatic advertisement understanding. To enable research on this problem, we create two datasets: an image dataset of 64,832 image ads, and a video dataset of 3,477 ads. Our data contains rich annotations encompassing the topic and sentiment of the ads, questions and answers describing what actions the viewer is prompted to take and the reasoning that the ad presents to persuade the viewer ("What should I do according to this ad, and why should I do it?"), and symbolic references ads make (e.g. a dove symbolizes peace). We also analyze the most common persuasive strategies ads use, and the capabilities that computer vision systems should have to understand these strategies. We present baseline classification results for several prediction tasks, including automatically answering questions about the messages of the ads. Comment: To appear in CVPR 2017; data available on http://cs.pitt.edu/~kovashka/ad

    Action Recognition in Videos: from Motion Capture Labs to the Web

    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework that highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and thus the constraints imposed on the type of video that each technique is able to address. Making the hypotheses and constraints explicit makes the framework particularly useful for selecting a method for a given application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area. Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 table

    Informedia at TRECVID 2003: Analyzing and searching broadcast news video

    We submitted a number of semantic classifiers, most of which were trained only on keyframes. We also experimented with runs in which classifiers were trained exclusively on text data and relative time within the video, while a few were trained using all available modalities.
    1.2 Interactive search
    This year, we submitted two runs using different versions of the Informedia system. In one run, a version identical to last year's interactive system was used by five researchers, who split up the topics among themselves. The system interface emphasizes text queries, allowing search across ASR, closed captions and OCR text. The result set can then be manipulated through:
    • storyboards of images spanning across video story segments
    • emphasizing matching shots to a user’s query to reduce the image count to a manageable size
    • resolution and layout under user control
    • additional filtering provided through shot classifiers such as outdoors, and shots with people, etc.
    • display of filter count and distribution to guide their use in manipulating storyboard views.
    In the best-performing interactive run, a single researcher used an improved version of the system for all topics, which allowed more effective browsing and visualization of the results of text queries using

    10371 Abstracts Collection -- Dynamic Maps

    From September 12th to 17th, 2010, the Dagstuhl Seminar 10371 "Dynamic Maps" was held in Schloss Dagstuhl – Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Findings of Factify 2: Multimodal Fake News Detection

    With social media usage growing exponentially in the past few years, fake news has also become extremely prevalent. The detrimental impact of fake news emphasizes the need for research focused on automating the detection of false information and verifying its accuracy. In this work, we present the outcome of the Factify 2 shared task, which provides a multi-modal fact verification and satire news dataset, as part of the DeFactify 2 workshop at AAAI'23. The data calls for a comparison-based approach to the task by pairing social media claims with supporting documents, each containing both text and image, divided into 5 classes based on multi-modal relations. In the second iteration of this task we had over 60 participants and 9 final test-set submissions. The best performances came from the use of DeBERTa for text and Swinv2 and CLIP for image. The highest F1 score averaged over all five classes was 81.82%. Comment: Defactify2 @AAAI 202

    Recent Developments in Video Surveillance

    With surveillance cameras installed everywhere and continuously streaming thousands of hours of video, how can that huge amount of data be analyzed or even be useful? Is it possible to search those countless hours of video for subjects or events of interest? Shouldn’t the presence of a car stopped at a railroad crossing trigger an alarm system to prevent a potential accident? In the chapters selected for this book, experts in video surveillance provide answers to these questions and other interesting problems, skillfully blending research experience with practical real-life applications. Academic researchers will find a reliable compilation of relevant literature in addition to pointers to current advances in the field. Industry practitioners will find useful hints about state-of-the-art applications. The book also provides directions for open problems where further advances can be pursued.