11,473 research outputs found
Fidelity-Weighted Learning
Training deep neural networks requires many training samples, but in practice
training labels are expensive to obtain and may be of varying quality, as some
may be from trusted expert labelers while others might be from heuristics or
other sources of weak supervision such as crowd-sourcing. This creates a
fundamental quality versus-quantity trade-off in the learning process. Do we
learn from the small amount of high-quality data or the potentially large
amount of weakly-labeled data? We argue that if the learner could somehow know
and take the label-quality into account when learning the data representation,
we could get the best of both worlds. To this end, we propose
"fidelity-weighted learning" (FWL), a semi-supervised student-teacher approach
for training deep neural networks using weakly-labeled data. FWL modulates the
parameter updates to a student network (trained on the task we care about) on a
per-sample basis according to the posterior confidence of its label-quality
estimated by a teacher (who has access to the high-quality labels). Both
student and teacher are learned from the data. We evaluate FWL on two tasks in
information retrieval and natural language processing where we outperform
state-of-the-art alternative semi-supervised methods, indicating that our
approach makes better use of strong and weak labels, and leads to better
task-dependent data representations.Comment: Published as a conference paper at ICLR 201
Taking the bite out of automated naming of characters in TV video
We investigate the problem of automatically labelling appearances of characters in TV or film material
with their names. This is tremendously challenging due to the huge variation in imaged appearance of each character and the weakness and ambiguity of available annotation. However, we demonstrate that high precision can be achieved by combining multiple sources of information, both visual and textual. The principal novelties that we introduce are: (i) automatic generation of time stamped character annotation by aligning subtitles and transcripts; (ii) strengthening the supervisory information by identifying
when characters are speaking. In addition, we incorporate complementary cues of face matching and clothing matching to propose common annotations for face tracks, and consider choices of classifier which can potentially correct errors made in the automatic extraction of training data from the weak textual annotation. Results are presented on episodes of the TV series ‘‘Buffy the Vampire Slayer”
Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning
Object category localization is a challenging problem in computer vision.
Standard supervised training requires bounding box annotations of object
instances. This time-consuming annotation process is sidestepped in weakly
supervised learning. In this case, the supervised information is restricted to
binary labels that indicate the absence/presence of object instances in the
image, without their locations. We follow a multiple-instance learning approach
that iteratively trains the detector and infers the object locations in the
positive training images. Our main contribution is a multi-fold multiple
instance learning procedure, which prevents training from prematurely locking
onto erroneous object locations. This procedure is particularly important when
using high-dimensional representations, such as Fisher vectors and
convolutional neural network features. We also propose a window refinement
method, which improves the localization accuracy by incorporating an objectness
prior. We present a detailed experimental evaluation using the PASCAL VOC 2007
dataset, which verifies the effectiveness of our approach.Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence (TPAMI
A combined sensitivity analysis and kriging surrogate modeling for early validation of health indicators
To increase the dependability of complex systems, one solution is to assess their state of health continuously through the monitoring of variables sensitive to potential degradation modes. When computed in an operating environment, these variables, known as health indicators, are subject to many uncertainties. Hence, the stochastic nature of health assessment combined with the lack of data in design stages makes it difficult to evaluate the efficiency of a health indicator before the system enters into service. This paper introduces a method for early validation of health indicators during the design stages of a system development process. This method uses physics-based modeling and uncertainties propagation to create simulated stochastic data. However, because of the large number of parameters defining the model and its computation duration, the necessary runtime for uncertainties propagation is prohibitive. Thus, kriging is used to obtain low computation time estimations of the model outputs. Moreover, sensitivity analysis techniques are performed upstream to determine the hierarchization of the model parameters and to reduce the dimension of the input space. The validation is based on three types of numerical key performance indicators corresponding to the detection, identification and prognostic processes. After having introduced and formalized the framework of uncertain systems modeling and the different performance metrics, the issues of sensitivity analysis and surrogate modeling are addressed. The method is subsequently applied to the validation of a set of health indicators for the monitoring of an aircraft engine's pumping unit
From Benedict Cumberbatch to Sherlock Holmes: Character Identification in TV series without a Script
The goal of this paper is the automatic identification of characters in TV
and feature film material. In contrast to standard approaches to this task,
which rely on the weak supervision afforded by transcripts and subtitles, we
propose a new method requiring only a cast list. This list is used to obtain
images of actors from freely available sources on the web, providing a form of
partial supervision for this task. In using images of actors to recognize
characters, we make the following three contributions: (i) We demonstrate that
an automated semi-supervised learning approach is able to adapt from the
actor's face to the character's face, including the face context of the hair;
(ii) By building voice models for every character, we provide a bridge between
frontal faces (for which there is plenty of actor-level supervision) and
profile (for which there is very little or none); and (iii) by combining face
context and speaker identification, we are able to identify characters with
partially occluded faces and extreme facial poses. Results are presented on the
TV series 'Sherlock' and the feature film 'Casablanca'. We achieve the
state-of-the-art on the Casablanca benchmark, surpassing previous methods that
have used the stronger supervision available from transcripts
- …