25 research outputs found

UvA-DARE (Digital Academic Repository) In a hurry to work with high-speed video at school?

Author: A J P
P H M Uijlings
Publication date: 05/03/2020

Detecting People in Artwork with CNNs

Author: B Xiao
Bai Xiao
D Hoiem
EJ Crowley
H. Giebel
HA Rowley
J Willats
John Canny
JR Uijlings
K Fukushima
K He
M Everingham
N Srivastava
O. Matan
P Hall
PF Felzenszwalb
Q Wu
R Hu
R Vaillant
S Ginosar
T Funkhouser
Y LeCun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/09/2016

CNNs have massively improved performance in object detection in photographs. However, research into object detection in artwork remains limited. We show state-of-the-art performance on a challenging dataset, People-Art, which contains people from photos, cartoons and 41 different artwork movements. We achieve this high performance by fine-tuning a CNN for this task, thus also demonstrating that training CNNs on photos results in overfitting for photos: only the first three or four layers transfer from photos to artwork. Although the CNN's performance is the highest yet, it remains less than 60% AP, suggesting further work is needed for the cross-depiction problem. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-46604-0_57
Comment: 14 pages, plus 3 pages of references; 7 figures in ECCV 2016 Workshop

arXiv.org e-Print Archive

Crossref

Explore Bristol Research

The Visual Extent of an Object

Author: A. F. Smeaton
A. Oliva
A. Rabinovich
A. Singhal
A. Vedaldi
A. W. M. Smeulders
B. Fulkerson
C. M. Bishop
D. G. Lowe
D. Hoiem
E. Nowak
F. Jurie
F. Moosmann
G. Csurka
H. Harzallah
I. Biederman
J. R. R. Uijlings
J. Shotton
J. Sivic
J. Zhang
K. E. A. Sande van de
K. Mikolajczyk
L. Wolf
M. A. Tahir
M. B. Blaschko
M. Bar
M. C. Burl
M. Everingham
M. M. Ullah
M. Marszałek
N. Dalal
P. Carbonetto
P. Geurts
R. Fergus
R. J. H. Scha
S. Agarwal
S. Gould
S. K. Divvala
S. Lazebnik
S. Maji
T. Malisiewicz
T. Tuytelaars
V. Nedović
Y. G. Jiang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Cross-Modal Supervision for Learning Active Speaker Detection in Video

Author: A Khosla
A Vedaldi
E Khoury
F Perronnin
H Bilen
JR Uijlings
M Everingham
P Pletscher
T Deselaers
T Tommasi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

© Springer International Publishing AG 2016. In this paper, we show how to use audio to supervise the learning of active speaker detection in video. Voice Activity Detection (VAD) guides the learning of the vision-based classifier in a weakly supervised manner. The classifier uses spatio-temporal features to encode upper body motion: the facial expressions and gesticulations associated with speaking. We further improve a generic model for active speaker detection by learning person-specific models. Finally, we demonstrate the online adaptation of generic models learnt on one dataset to previously unseen people in a new dataset, again using audio (VAD) for weak supervision. The use of temporal continuity overcomes the lack of clean training data. We are the first to present an active speaker detection system that learns on one audio-visual dataset and automatically adapts to speakers in a new dataset. This work can be seen as an example of how the availability of multi-modal data allows us to learn a model without the need for supervision, by transferring knowledge from one modality to another.
Chakravarty P., Tuytelaars T., "Cross-modal supervision for learning active speaker detection in video", Lecture Notes in Computer Science, vol. 9909, pp. 285-301, 2016 (14th European Conference on Computer Vision - ECCV 2016, October 11-14, 2016, Amsterdam, The Netherlands).
status: publishe

Lirias

Crossref