Detecting and Naming Actors in Movies using Generative Appearance Models
We introduce a generative model for learning person- and costume-specific detectors from labeled examples. We demonstrate the model on the task of localizing and naming actors in long video sequences. More specifically, the actor's head and shoulders are each represented as a constellation of optional color regions, so detection can proceed despite changes in viewpoint and partial occlusions. We explain how to learn the models from a small number of labeled keyframes or video tracks, and how to detect novel appearances of the actors in a maximum-likelihood framework. We present results on a challenging example, detecting and naming the eight actors of Alfred Hitchcock's film "Rope", with 81% recall in actor detection (coverage) and 89% precision in actor identification (naming).
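The maximum-likelihood naming step described above can be illustrated with a toy sketch. The actor names, regions, and probabilities below are invented for illustration and are not the paper's learned models: each actor is scored by the log-likelihood of the observed region colours, and unobserved regions are simply skipped, which mirrors in spirit how optional regions tolerate partial occlusion.

```python
import math

# Hypothetical per-region colour distributions; in the paper these
# would be learned from labeled keyframes or video tracks.
ACTOR_MODELS = {
    "Brandon": {"hair": {"brown": 0.7, "black": 0.3},
                "shirt": {"white": 0.8, "grey": 0.2}},
    "Phillip": {"hair": {"black": 0.9, "brown": 0.1},
                "shirt": {"grey": 0.6, "white": 0.4}},
}

def name_actor(observed_regions):
    """Return the actor whose model assigns the observed colours the
    highest log-likelihood; missing regions are skipped (optional)."""
    best_name, best_score = None, float("-inf")
    for name, model in ACTOR_MODELS.items():
        score = 0.0
        for region, colour in observed_regions.items():
            # Small floor probability for colours the model never saw.
            score += math.log(model.get(region, {}).get(colour, 1e-3))
        if score > best_score:
            best_name, best_score = name, score
    return best_name

print(name_actor({"hair": "black", "shirt": "grey"}))  # → Phillip
```

Because each region contributes independently, an occluded shirt simply drops its term from the sum rather than vetoing the match.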
A Computational Framework for Vertical Video Editing
Vertical video editing is the process of digitally editing the image within the frame, as opposed to horizontal video editing, which arranges the shots along a timeline. Vertical editing can be a time-consuming and error-prone process when using manual key-framing and simple interpolation. In this paper, we present a general framework for automatically computing a variety of cinematically plausible shots from a single input video, suited to the special case of live performances. Drawing on working practices in traditional cinematography, the system acts as a virtual camera assistant to the film editor, who can call novel shots in the edit room with a combination of high-level instructions and manually selected keyframes.
Taking the bite out of automated naming of characters in TV video
We investigate the problem of automatically labelling appearances of characters in TV or film material
with their names. This is tremendously challenging due to the huge variation in imaged appearance of each character and the weakness and ambiguity of available annotation. However, we demonstrate that high precision can be achieved by combining multiple sources of information, both visual and textual. The principal novelties that we introduce are: (i) automatic generation of time stamped character annotation by aligning subtitles and transcripts; (ii) strengthening the supervisory information by identifying
when characters are speaking. In addition, we incorporate complementary cues of face matching and clothing matching to propose common annotations for face tracks, and consider choices of classifier which can potentially correct errors made in the automatic extraction of training data from the weak textual annotation. Results are presented on episodes of the TV series "Buffy the Vampire Slayer".
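The subtitle/transcript alignment in contribution (i) can be sketched in a few lines. This is a hedged illustration of the general idea, not the paper's implementation, and the dialogue lines are invented: transcripts carry speaker names but no timing, subtitles carry timing but no names, so matching each subtitle's text against transcript lines transfers a name onto a time stamp.

```python
import difflib

# Invented example data: (speaker, text) from a transcript, and
# ((start_sec, end_sec), text) from a subtitle file.
transcript = [
    ("BUFFY", "we have to stop the ritual tonight"),
    ("GILES", "the prophecy is quite clear on this point"),
]
subtitles = [
    ((12.0, 14.5), "We have to stop the ritual tonight."),
    ((15.1, 18.0), "The prophecy is quite clear on this point."),
]

def align(subtitles, transcript, threshold=0.8):
    """Label each subtitle with the speaker of its best-matching
    transcript line, keeping only confident matches."""
    labelled = []
    for (start, end), text in subtitles:
        scored = [
            (difflib.SequenceMatcher(None, text.lower(), line.lower()).ratio(),
             speaker)
            for speaker, line in transcript
        ]
        ratio, speaker = max(scored)
        if ratio >= threshold:
            labelled.append((start, end, speaker))
    return labelled

print(align(subtitles, transcript))
```

Real subtitles and transcripts differ in punctuation, line breaks, and paraphrase, which is why a fuzzy similarity with a threshold is used rather than exact string equality.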
Hiding in Plain Sight: A Longitudinal Study of Combosquatting Abuse
Domain squatting is a common adversarial practice where attackers register
domain names that are purposefully similar to popular domains. In this work, we
study a specific type of domain squatting called "combosquatting," in which
attackers register domains that combine a popular trademark with one or more
phrases (e.g., betterfacebook[.]com, youtube-live[.]com). We perform the first
large-scale, empirical study of combosquatting by analyzing more than 468
billion DNS records, collected from passive and active DNS data sources over
almost six years. We find that almost 60% of abusive combosquatting domains
live for more than 1,000 days, and even worse, we observe increased activity
associated with combosquatting year over year. Moreover, we show that
combosquatting is used to perform a spectrum of different types of abuse
including phishing, social engineering, affiliate abuse, trademark abuse, and
even advanced persistent threats. Our results suggest that combosquatting is a
real problem that requires increased scrutiny by the security community.
Comment: ACM CCS 1
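The combosquatting definition used above admits a minimal sketch. The trademark list and domains below are illustrative, not drawn from the study's data: a domain combosquats a trademark when its registrable label contains the trademark verbatim plus additional characters, which also distinguishes combosquatting from typosquatting (where the trademark itself is altered).

```python
# Illustrative trademark list; the study uses a much larger set.
TRADEMARKS = ["facebook", "youtube", "paypal"]

def is_combosquat(domain):
    """True if the domain's first label combines a trademark with
    extra characters, e.g. betterfacebook.com or youtube-live.com."""
    label = domain.split(".")[0]
    for tm in TRADEMARKS:
        if tm in label and label != tm:
            return True
    return False

print(is_combosquat("betterfacebook.com"))  # → True
print(is_combosquat("youtube-live.com"))    # → True
print(is_combosquat("facebook.com"))        # → False
print(is_combosquat("faceb00k.com"))        # → False (typosquat, not combosquat)
```

Note the last case: because the trademark must appear verbatim, character-substitution typosquats fall outside this definition, matching the paper's framing of combosquatting as a distinct abuse category.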
Person Recognition in Personal Photo Collections
Recognising persons in everyday photos presents major challenges (occluded
faces, different clothing, locations, etc.) for machine vision. We propose a
convnet-based person recognition system and provide an in-depth analysis of
the informativeness of different body cues, the impact of training data, and
the common failure modes of the system. In addition, we discuss the
limitations of existing benchmarks and propose more challenging ones. Our
method is simple and built on open source code and open data, yet it improves
the state-of-the-art results on a large dataset of social media photos (PIPA).
Comment: Accepted to ICCV 2015, revise
Finding Actors and Actions in Movies
We address the problem of learning a joint model of actors and actions in movies using weak supervision provided by scripts. Specifically, we extract actor/action pairs from the script and use them as constraints in a discriminative clustering framework. The corresponding optimization problem is formulated as a quadratic program under linear constraints. People in video are represented by automatically extracted and tracked faces together with corresponding motion features. First, we apply the proposed framework to the task of learning names of characters in the movie and demonstrate significant improvements over previous methods used for this task. Second, we explore the joint actor/action constraint and show its advantage for weakly supervised action learning. We validate our method in the challenging setting of localizing and recognizing characters and their actions in the feature-length movies Casablanca and American Beauty.
Detecting and Grounding Important Characters in Visual Stories
Characters are essential to the plot of any story. Establishing the
characters before writing a story can improve the clarity of the plot and the
overall flow of the narrative. However, previous work on visual storytelling
tends to focus on detecting objects in images and discovering relationships
between them. In this approach, characters are not distinguished from other
objects when they are fed into the generation pipeline. The result is a
coherent sequence of events rather than a character-centric story. In order to
address this limitation, we introduce the VIST-Character dataset, which
provides rich character-centric annotations, including visual and textual
co-reference chains and importance ratings for characters. Based on this
dataset, we propose two new tasks: important character detection and character
grounding in visual stories. For both tasks, we develop simple, unsupervised
models based on distributional similarity and pre-trained vision-and-language
models. Our new dataset, together with these models, can serve as the
foundation for subsequent work on analysing and generating stories from a
character-centric perspective.
Comment: AAAI 202
Detecting People Looking at Each Other in Videos
The objective of this work is to determine whether people in TV video are interacting, by detecting whether or not they are looking at each other. We determine both the temporal period of the interaction and spatially localize the relevant people. We make the following four contributions: (i) head detection with implicit coarse pose information (front, profile, back); (ii) continuous head pose estimation in unconstrained scenarios (TV video) using Gaussian process regression; (iii) proposal and evaluation of several methods for assessing whether and when pairs of people are looking at each other in a video shot; and (iv) new ground-truth annotation for this task, extending the TV human interactions dataset (Patron-Perez et al. 2010). The performance of the methods is evaluated on this dataset, which consists of 300 video clips extracted from TV shows. Despite the variety and difficulty of this video material, our best method obtains an average precision of 87.6% in a fully automatic manner.
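The core mutual-gaze test in contribution (iii) can be reduced to a toy geometric check. This is a simplified 2-D sketch, not the paper's method (which uses learned head pose from Gaussian process regression): each head is a position plus a gaze direction, and two people "look at each other" when each gaze vector points roughly toward the other head.

```python
import math

def looking_at(pos_a, yaw_a, pos_b, tol=math.radians(15)):
    """True if A's gaze direction falls within `tol` radians of the
    direction from A toward B."""
    to_b = math.atan2(pos_b[1] - pos_a[1], pos_b[0] - pos_a[0])
    # Wrap the angular difference into [-pi, pi).
    diff = (yaw_a - to_b + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= tol

def mutual_gaze(pos_a, yaw_a, pos_b, yaw_b, tol=math.radians(15)):
    """True only when each person is looking toward the other."""
    return looking_at(pos_a, yaw_a, pos_b, tol) and \
           looking_at(pos_b, yaw_b, pos_a, tol)

# Two people on the x-axis, facing each other:
print(mutual_gaze((0, 0), 0.0, (5, 0), math.pi))  # → True
```

The asymmetry matters: if only one person turns away, `looking_at` fails in that direction and the pair is no longer in mutual gaze, which is the event the paper localizes in time.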