Watch, read and lookup: learning to spot signs from multiple supervisors
The focus of this work is sign spotting - given a video of an isolated sign,
our task is to identify whether and where it has been signed in a continuous,
co-articulated sign language video. To achieve this sign spotting task, we
train a model using multiple types of available supervision by: (1) watching
existing sparsely labelled footage; (2) reading associated subtitles (readily
available translations of the signed content) which provide additional
weak-supervision; (3) looking up words (for which no co-articulated labelled
examples are available) in visual sign language dictionaries to enable novel
sign spotting. These three tasks are integrated into a unified learning
framework using the principles of Noise Contrastive Estimation and Multiple
Instance Learning. We validate the effectiveness of our approach on low-shot
sign spotting benchmarks. In addition, we contribute a machine-readable British
Sign Language (BSL) dictionary dataset of isolated signs, BSLDict, to
facilitate study of this task. The dataset, models and code are available at
our project page.
Comment: Appears in: Asian Conference on Computer Vision 2020 (ACCV 2020) - Oral presentation. 29 pages
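The Noise Contrastive Estimation matching described above can be sketched as an InfoNCE-style objective: the isolated dictionary-sign embedding should score highest against the continuous-video window that actually contains the sign. This is an illustrative sketch only; the function name, shapes, and temperature are assumptions, not the authors' implementation.

```python
import numpy as np

def nce_spotting_loss(dict_emb, window_embs, pos_idx, temp=0.1):
    """InfoNCE-style sketch: score the isolated-sign (dictionary) embedding
    against every candidate window of the continuous video; the window weakly
    labelled as containing the sign is the positive.

    dict_emb:    (d,) embedding of the isolated dictionary sign
    window_embs: (n, d) embeddings of candidate windows in continuous video
    pos_idx:     index of the window known (weakly) to contain the sign
    """
    # cosine similarities between the query sign and every window
    q = dict_emb / np.linalg.norm(dict_emb)
    w = window_embs / np.linalg.norm(window_embs, axis=1, keepdims=True)
    sims = w @ q / temp
    # softmax cross-entropy against the positive window
    sims -= sims.max()
    log_probs = sims - np.log(np.exp(sims).sum())
    return -log_probs[pos_idx]
```

Under Multiple Instance Learning, the positive index itself is unknown within a subtitle-aligned segment; one common relaxation is to treat the best-scoring window in the segment as the positive.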
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. This comprehensive problem-oriented
review of the advances in transfer learning has revealed not only the
challenges of transfer learning for visual recognition, but also the
problems (eight of the seventeen) that have scarcely been studied. This
survey not only presents an up-to-date technical review for researchers,
but also offers a systematic approach and a reference for machine learning
practitioners to categorise a real problem and look up a possible solution
accordingly.
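The categorisation by data and label attributes can be pictured as a small decision function mapping a task's attributes to a problem family. The attribute names and families below are hypothetical illustrations of the idea, not the paper's actual seventeen-problem taxonomy.

```python
# Illustrative sketch of a problem-oriented lookup: map a few data/label
# attributes of a cross-dataset task to a coarse problem family.
# Attributes and family names are hypothetical, not the survey's taxonomy.

def categorise(source_labelled, target_labelled, same_label_space):
    if same_label_space:
        if target_labelled:
            return "supervised domain adaptation"
        return "unsupervised domain adaptation"
    if target_labelled:
        return "inductive transfer / fine-tuning"
    return "zero-shot / heterogeneous transfer"
```

A practitioner would extend this with further attributes (balanced vs. imbalanced data, sequential vs. batch availability, and so on) to reach a finer-grained set of problems.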
Exploring Metaphorical Senses and Word Representations for Identifying Metonyms
A metonym is a word with a figurative meaning, similar to a metaphor. Because
metonyms are closely related to metaphors, we apply features that are used
successfully for metaphor recognition to the task of detecting metonyms. On the
ACL SemEval 2007 Task 8 data with gold standard metonym annotations, our system
achieved 86.45% accuracy on the location metonyms. Our code can be found on
GitHub.
Comment: 9 pages, 8 pages content
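Features that work for metaphor recognition often measure how far a word's contextual use drifts from its literal sense; the same cues can be reused for metonym detection. The sketch below is a hypothetical feature extractor in that spirit; the feature set and names are illustrative, not the system described in the abstract.

```python
import numpy as np

def metonym_features(word_vec, literal_sense_vec, context_vec):
    """Hypothetical feature vector for metonym detection, reusing the kind
    of cues used for metaphor recognition: how the word's contextual use
    relates to its literal sense.
    """
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return np.array([
        cos(word_vec, literal_sense_vec),    # closeness to literal sense
        cos(word_vec, context_vec),          # fit with the local context
        cos(literal_sense_vec, context_vec)  # literal sense vs. context
    ])
```

These features would then feed a standard classifier trained on the gold-standard metonym annotations.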
Dirichlet belief networks for topic structure learning
Recently, considerable research effort has been devoted to developing deep
architectures for topic models to learn topic structures. Although several deep
models have been proposed to learn better topic proportions of documents, how
to leverage the benefits of deep structures for learning word distributions of
topics has not yet been rigorously studied. Here we propose a new multi-layer
generative process on word distributions of topics, where each layer consists
of a set of topics and each topic is drawn from a mixture of the topics of the
layer above. As the topics in all layers can be directly interpreted by words,
the proposed model is able to discover interpretable topic hierarchies. As a
self-contained module, our model can be flexibly adapted to different kinds of
topic models to improve their modelling accuracy and interpretability.
Extensive experiments on text corpora demonstrate the advantages of the
proposed model.
Comment: accepted in NIPS 201
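The multi-layer generative process on word distributions can be sketched directly: top-layer topics are Dirichlet draws over the vocabulary, and each topic in a lower layer is a convex combination of the topics one layer above, with Dirichlet mixture weights. Layer sizes and priors below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_topic_hierarchy(vocab_size=50, layer_sizes=(3, 6, 12), alpha=0.1):
    """Sketch of the hierarchical generative process: each topic is a
    distribution over words, and every lower-layer topic is drawn as a
    mixture of the topics of the layer above, so all layers stay directly
    interpretable in word space.
    """
    # top layer: topics drawn straight from a Dirichlet over the vocabulary
    layers = [rng.dirichlet(np.full(vocab_size, alpha), size=layer_sizes[0])]
    for size in layer_sizes[1:]:
        parents = layers[-1]                                  # (K_above, V)
        # Dirichlet mixture weights over the parent topics
        weights = rng.dirichlet(np.full(len(parents), 1.0), size=size)
        layers.append(weights @ parents)  # each row is a valid word distribution
    return layers
```

Because each row is a convex combination of word distributions, every layer's topics remain proper distributions over the vocabulary, which is what makes the hierarchy interpretable.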
Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation
How do computers and intelligent agents view the world around them? Feature
extraction and representation constitute one of the basic building blocks towards
answering this question. Traditionally, this has been done with carefully
engineered hand-crafted techniques such as HOG, SIFT or ORB. However, there is
no "one size fits all" approach that satisfies all requirements. In recent
years, the rising popularity of deep learning has resulted in a myriad of
end-to-end solutions to many computer vision problems. These approaches, while
successful, tend to lack scalability and can't easily exploit information
learned by other systems. Instead, we propose SAND features, a dedicated deep
learning solution to feature extraction capable of providing hierarchical
context information. This is achieved by employing sparse relative labels
indicating relationships of similarity/dissimilarity between image locations.
The nature of these labels results in an almost infinite set of dissimilar
examples to choose from. We demonstrate how the selection of negative examples
during training can be used to modify the feature space and vary its
properties. To demonstrate the generality of this approach, we apply the
proposed features to a multitude of tasks, each requiring different properties.
This includes disparity estimation, semantic segmentation, self-localisation
and SLAM. In all cases, we show how incorporating SAND features results in
better or comparable results to the baseline, whilst requiring little to no
additional training. Code can be found at:
https://github.com/jspenmar/SAND_features
Comment: CVPR201
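Training on sparse relative labels can be sketched as a hinge loss over similar/dissimilar location pairs: an anchor pixel's feature should lie closer to a "similar" location than to any sampled "dissimilar" one, and the choice of negatives shapes the feature space, as the abstract notes. The function name and margin value are illustrative assumptions, not the SAND training code.

```python
import numpy as np

def relative_label_loss(anchor, positive, negatives, margin=1.0):
    """Hinge-loss sketch over sparse relative labels: the anchor feature
    should be at least `margin` closer to the similar location than to
    every sampled dissimilar location. Sampling negatives nearby vs. far
    away is what varies the properties of the learned feature space.

    anchor, positive: (d,) feature vectors at two image locations
    negatives:        (n, d) features at sampled dissimilar locations
    """
    d_pos = np.linalg.norm(anchor - positive)
    d_negs = np.linalg.norm(negatives - anchor, axis=1)
    # penalise negatives that are not at least `margin` farther than the positive
    return float(np.maximum(0.0, d_pos - d_negs + margin).mean())
```

With an almost unlimited pool of dissimilar locations per image, the sampler, not the label set, becomes the main design lever.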
Identifying Expert Reviews in the Crowd: Linking Curated and Noisy Domains
Over the past decade, a vast number of online consumer reviews have established a
significant presence on the Internet. These reviews play a vital role in consumer
awareness about products and deeply impact the consumer's decision-making
process. On one hand, websites like Amazon and Yelp provide huge collections of
crowd-sourced reviews, written by consumers who have first-hand experience with
the product; many researchers, however, question the credibility and bias of these
reviews. These factors, coupled with the sheer volume of reviews for each product,
can make it tiring to form a perspective about the product. On the other hand,
websites like Wirecutter and Thesweetsetup provide highly curated, detailed guides
on products across various categories. Although these reviews are unbiased expert
opinions, they require rigorous reporting, interviewing, and testing by various
journalists, scientists, and researchers, making them hard to scale.
Our aim is to study the possible correlations between the crowd-sourced noisy
domain reviews and the curated reviews. We take into account meta-features of
reviews, context-based textual features of reviews, and word-embedding-based
features of words from reviews. In addition, we identify “good reviews”, defined as
those noisy domain reviews that align with the curated ones, and use this to propose
a general-purpose, extremely streamlined recommender that can provide value to the
general public without any personalized inputs. This research will contribute
significantly towards identifying unbiased crowd-sourced reviews that align with
curated reviews, across different categories of products, thereby linking the curated
and noisy domains. It will also contribute significantly towards understanding the
intricacies of good product reviews across different categories.
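The "good review" criterion described above can be sketched as an embedding-alignment test: a crowd-sourced review counts as good if it lies close enough, in embedding space, to at least one curated review for the same product. The embeddings, similarity measure, and threshold below are illustrative assumptions, not the study's actual method.

```python
import numpy as np

def is_good_review(noisy_emb, curated_embs, threshold=0.8):
    """Sketch of the alignment test: a noisy-domain review is 'good' if its
    embedding's cosine similarity to some curated review exceeds a threshold.

    noisy_emb:    (d,) embedding of one crowd-sourced review
    curated_embs: (n, d) embeddings of curated reviews for the same product
    """
    q = noisy_emb / np.linalg.norm(noisy_emb)
    C = curated_embs / np.linalg.norm(curated_embs, axis=1, keepdims=True)
    return bool((C @ q).max() >= threshold)
```

Reviews passing this test could then be surfaced by the streamlined recommender without any personalized inputs.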