Interactively Test Driving an Object Detector: Estimating Performance on Unlabeled Data
In this paper, we study the problem of `test-driving' a detector, i.e.
allowing a human user to get a quick sense of how well the detector generalizes
to their specific requirement. To this end, we present the first system that
estimates detector performance interactively without extensive ground truthing
using a human in the loop. We approach this as a problem of estimating
proportions and show that it is possible to make accurate inferences on the
proportion of classes or groups within a large data collection by observing
only a fraction of the samples from the data. In estimating the false detections (for
precision), the samples are chosen carefully such that the overall
characteristics of the data collection are preserved. Next, inspired by their use
in estimating disease propagation, we apply pooled testing approaches to
estimate missed detections (for recall) from the dataset. The estimates thus
obtained are close to the ones obtained using ground truth, thus reducing the
need for extensive labeling, which is expensive and time consuming.
Comment: Published at Winter Conference on Applications of Computer Vision, 201
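The two estimation steps described in this abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a plain random sample of user-labeled detections for precision (the abstract's sampling is more careful than that), uses a Wilson interval for the uncertainty, and uses the textbook group-testing prevalence estimator for the pooled step. The function names and the notion of a "pool" (a group of items the user inspects together and flags if any contains a missed object) are hypothetical.

```python
import math


def estimate_precision(sample_is_correct, z=1.96):
    """Estimate precision from a labeled sample of detections.

    sample_is_correct: list of booleans, True if the user judged the
    sampled detection to be correct (assumed to be a random sample).
    Returns (point_estimate, (low, high)) with a Wilson score interval.
    """
    n = len(sample_is_correct)
    if n == 0:
        raise ValueError("need at least one labeled sample")
    p = sum(sample_is_correct) / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return p, (max(0.0, centre - half), min(1.0, centre + half))


def pooled_prevalence(pool_is_positive, pool_size):
    """Standard group-testing estimate of per-item prevalence.

    pool_is_positive: list of booleans, True if the pool contained at
    least one positive item (here, hypothetically, a missed detection).
    pool_size: number of items in every pool (assumed equal).
    If a fraction f of pools is positive, then 1 - f = (1 - p) ** pool_size.
    """
    if not pool_is_positive or pool_size < 1:
        raise ValueError("need at least one pool and pool_size >= 1")
    f = sum(pool_is_positive) / len(pool_is_positive)
    return 1 - (1 - f) ** (1 / pool_size)


if __name__ == "__main__":
    labels = [True] * 80 + [False] * 20   # 100 labeled detections
    print(estimate_precision(labels))
    pools = [True] * 19 + [False] * 81    # 100 pools of 2 items each
    print(pooled_prevalence(pools, 2))
```

The pooled estimator needs only one yes/no judgment per pool, which is why pooling can reduce the labeling effort when positives are rare.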
Indirect Match Highlights Detection with Deep Convolutional Neural Networks
Highlights in a sports video are usually defined as actions that stimulate
excitement or attract the attention of the audience. Considerable effort is spent on
designing techniques that automatically find highlights, in order to
automate the otherwise manual editing process. Most of the state-of-the-art
approaches try to solve the problem by training a classifier using the
information extracted from the TV-like framing of players on the game
pitch, learning to detect game actions that are labeled by human observers
according to their perception of a highlight. Obviously, this is long and
expensive work. In this paper, we reverse the paradigm: instead of looking at
the gameplay, inferring what could be exciting for the audience, we directly
analyze the audience behavior, which we assume is triggered by events happening
during the game. We apply a deep 3D Convolutional Neural Network (3D-CNN) to
extract visual features from cropped video recordings of the supporters that
are attending the event. Outputs of the crops belonging to the same frame are
then accumulated to produce a value indicating the Highlight Likelihood (HL)
which is then used to discriminate between positive (i.e. when a highlight
occurs) and negative samples (i.e. standard play or time-outs). Experimental
results on a public dataset of ice-hockey matches demonstrate the effectiveness
of our method and promote further research in this new exciting direction.
Comment: "Social Signal Processing and Beyond" workshop, in conjunction with ICIAP 201
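The accumulation step described in this abstract (per-crop network outputs combined into a per-frame Highlight Likelihood, then used to separate highlights from normal play) can be sketched as follows. The abstract does not say how the outputs are accumulated or how the decision is made, so the mean and the fixed threshold below are assumptions, and the function names are hypothetical. The per-crop scores stand in for 3D-CNN outputs, which are not computed here.

```python
def highlight_likelihood(crop_scores):
    """Accumulate the scores of all crops from one frame into a single HL value.

    crop_scores: list of floats, one per audience crop (assumed to be
    network outputs in [0, 1]). The mean is an assumed accumulation rule.
    """
    if not crop_scores:
        raise ValueError("frame has no crops")
    return sum(crop_scores) / len(crop_scores)


def classify_frames(scores_by_frame, threshold=0.5):
    """Label each frame as highlight (True) or not (False).

    scores_by_frame: dict mapping a frame index to its list of crop scores.
    threshold: assumed decision threshold on the HL value.
    """
    return {
        frame: highlight_likelihood(scores) >= threshold
        for frame, scores in scores_by_frame.items()
    }


if __name__ == "__main__":
    example = {0: [0.9, 0.7, 0.8], 1: [0.1, 0.2]}
    print(classify_frames(example))
```

Averaging keeps the HL value comparable across frames with different numbers of crops; a sum or a max would be equally plausible readings of "accumulated".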