Describing Common Human Visual Actions in Images
Which common human actions and interactions are recognizable in monocular
still images? Which involve objects and/or other people? How many is a person
performing at a time? We address these questions by exploring the actions and
interactions that are detectable in the images of the MS COCO dataset. We make
two main contributions. First, a list of 140 common 'visual actions', obtained
by analyzing the largest on-line verb lexicon currently available for English
(VerbNet) and human sentences used to describe images in MS COCO. Second, a
complete set of annotations for those 'visual actions', composed of
subject-object and associated verb, which we call COCO-a (a for 'actions').
COCO-a is larger than existing action datasets in terms of number of actions
and instances of these actions, and is unique because it is data-driven, rather
than experimenter-biased. Other unique features are that it is exhaustive, and
that all subjects and objects are localized. A statistical analysis of the
accuracy of our annotations and of each action, interaction and subject-object
combination is provided.
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is the problem of finding, in a
large database, the data items whose distances to a query item are the smallest.
Various methods have been developed to address this problem, and recently much
effort has been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality sensitive hashing. We divide the hashing
algorithms into two main categories: locality sensitive hashing, which designs
hash functions without exploring the data distribution, and learning to hash,
which learns hash functions according to the data distribution. We review them
from various aspects, including hash function design, distance measure, and
search scheme in the hash coding space.
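The locality sensitive hashing category described in this abstract can be illustrated with random-hyperplane (sign random projection) hashing, one well-known data-independent scheme. The following is a minimal sketch and is not taken from the survey; the function names, the 16-bit code length, and the toy vectors are assumptions chosen for illustration.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw random Gaussian hyperplanes without looking at any data."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(code_a, code_b):
    """Number of differing bits, used as a proxy for angular distance."""
    return sum(a != b for a, b in zip(code_a, code_b))


# Hypothetical toy data: a query, a vector at a small angle, and a distant one.
planes = make_hyperplanes(num_bits=16, dim=3)
query = [1.0, 0.2, 0.0]
near = [0.9, 0.25, 0.05]
far = [-1.0, 0.1, 0.3]

code_query = hash_code(query, planes)
print("near:", hamming(code_query, hash_code(near, planes)))
print("far: ", hamming(code_query, hash_code(far, planes)))
```

Vectors separated by a small angle tend to agree on more bits, so Hamming distance between codes can stand in for angular distance during search. A learning-to-hash method would instead fit the hyperplanes to the data.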
On Graduated Optimization for Stochastic Non-Convex Problems
The graduated optimization approach, also known as the continuation method,
is a popular heuristic for solving non-convex problems that has received renewed
interest over the last decade. Despite its popularity, very little is known in
terms of theoretical convergence analysis. In this paper we describe a new
first-order algorithm based on graduated optimization and analyze its
performance. We characterize a parameterized family of non-convex functions
for which this algorithm provably converges to a global optimum. In particular,
we prove that the algorithm converges to an {\epsilon}-approximate solution
within O(1/\epsilon^2) gradient-based steps. We extend our algorithm and
analysis to the setting of stochastic non-convex optimization with noisy
gradient feedback, attaining the same convergence rate. Additionally, we
discuss the setting of zero-order optimization, and devise a variant of our
algorithm that converges at a rate of O(d^2/\epsilon^4).
Comment: 17 page
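The general idea of graduated optimization, which is to minimize a heavily smoothed version of the objective first and then reduce the smoothing step by step, can be sketched as follows. This is an illustrative toy and not the algorithm analyzed in the paper: the objective f(x) = x^2 + A(1 - cos(Bx)), the constants, the smoothing schedule, and the use of Gaussian smoothing (for which the smoothed gradient has a closed form) are all assumptions.

```python
import math

A, B = 10.0, 3.0  # hypothetical constants of the toy objective


def f(x):
    """Non-convex objective with global minimum at x = 0 and many local minima."""
    return x * x + A * (1.0 - math.cos(B * x))


def smoothed_grad(x, sigma):
    """Gradient of E[f(x + sigma * z)] for z ~ N(0, 1).

    Gaussian smoothing damps the cosine term by exp(-(B * sigma)^2 / 2);
    sigma = 0 recovers the gradient of f itself.
    """
    damp = math.exp(-0.5 * (B * sigma) ** 2)
    return 2.0 * x + A * B * damp * math.sin(B * x)


def gradient_descent(x, sigma, steps=500, lr=0.01):
    """Plain gradient descent on the objective smoothed at level sigma."""
    for _ in range(steps):
        x -= lr * smoothed_grad(x, sigma)
    return x


def graduated_descent(x, sigmas=(2.0, 1.0, 0.5, 0.0)):
    """Solve a sequence of progressively less smoothed problems, warm-starting each."""
    for sigma in sigmas:
        x = gradient_descent(x, sigma)
    return x


x_plain = gradient_descent(4.0, sigma=0.0)  # descent on f directly
x_graduated = graduated_descent(4.0)        # coarse-to-fine schedule
print(x_plain, f(x_plain))
print(x_graduated, f(x_graduated))
```

Started from x = 4, descent on f directly settles in a nearby local minimum, whereas the coarse-to-fine schedule first follows the nearly convex smoothed objective toward the origin and then refines there.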
Unbiased Comparative Evaluation of Ranking Functions
Eliciting relevance judgments for ranking evaluation is labor-intensive and
costly, motivating careful selection of which documents to judge. Unlike
traditional approaches that make this selection deterministically,
probabilistic sampling has shown intriguing promise since it enables the design
of estimators that are provably unbiased even when reusing data with missing
judgments. In this paper, we first unify and extend these sampling approaches
by viewing the evaluation problem as a Monte Carlo estimation task that applies
to a large number of common IR metrics. Drawing on the theoretical clarity that
this view offers, we tackle three practical evaluation scenarios: comparing two
systems, comparing systems against a baseline, and ranking systems. For
each scenario, we derive an estimator and a variance-optimizing sampling
distribution while retaining the strengths of sampling-based evaluation,
including unbiasedness, reusability despite missing data, and ease of use in
practice. In addition to the theoretical contribution, we empirically evaluate
our methods against previously used sampling heuristics and find that they
generally cut the number of required relevance judgments at least in half.
Comment: Under review; 10 page
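The Monte Carlo view described in this abstract can be illustrated with a basic importance-sampling estimator for a metric that is a weighted sum of per-document relevance. This is a minimal sketch under stated assumptions, not the estimators derived in the paper: the DCG-style weights, the toy relevance labels, and all function names are hypothetical.

```python
import math
import random


def dcg_weights(num_docs):
    """DCG-style rank discounts; any metric of the form sum(w_d * rel_d) works."""
    return [1.0 / math.log2(rank + 2) for rank in range(num_docs)]


def true_metric(weights, relevance):
    """Metric value when every document has been judged."""
    return sum(w * r for w, r in zip(weights, relevance))


def sampled_estimate(weights, judge, sampling_probs, num_samples, rng):
    """Unbiased estimate that judges only the sampled documents.

    sampling_probs must sum to 1 and be positive wherever w_d * rel_d can be
    non-zero; each draw is reweighted by 1 / sampling_probs[d].
    """
    docs = list(range(len(weights)))
    draws = rng.choices(docs, weights=sampling_probs, k=num_samples)
    total = sum(weights[d] * judge(d) / sampling_probs[d] for d in draws)
    return total / num_samples


# Hypothetical ranking of five documents with binary relevance labels.
relevance = [1, 0, 1, 1, 0]
weights = dcg_weights(len(relevance))
exact = true_metric(weights, relevance)

uniform = [1.0 / len(relevance)] * len(relevance)
estimate_uniform = sampled_estimate(
    weights, lambda d: relevance[d], uniform, 20000, random.Random(0)
)

# Sampling proportionally to w_d * rel_d makes every draw equal the true value.
# Real relevance is unknown in advance, so this only shows the ideal case.
ideal = [w * r / exact for w, r in zip(weights, relevance)]
estimate_ideal = sampled_estimate(
    weights, lambda d: relevance[d], ideal, 10, random.Random(0)
)
print(exact, estimate_uniform, estimate_ideal)
```

The second call shows why the choice of sampling distribution matters: the closer it tracks each document's contribution to the metric, the lower the variance of the estimate for a given number of judgments.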
Fine-Grained Product Class Recognition for Assisted Shopping
Assistive solutions for a better shopping experience can improve people's
quality of life, in particular that of visually impaired shoppers. We present
a system that visually recognizes the fine-grained product classes of items on
a shopping list in shelf images taken with a smartphone in a grocery store.
Our system consists of three components: (a) We automatically recognize useful
text on product packaging, e.g., product name and brand, and build a mapping of
words to product classes based on the large-scale GroceryProducts dataset. When
the user populates the shopping list, we automatically infer the product class
of each entered word. (b) We perform fine-grained product class recognition
when the user is facing a shelf. We discover discriminative patches on product
packaging to differentiate between visually similar product classes and to
increase the robustness against continuous changes in product design. (c) We
continuously improve the recognition accuracy through active learning. Our
experiments show the robustness of the proposed method against cross-domain
challenges, and the scalability to an increasing number of products with
minimal re-training.
Comment: Accepted at ICCV Workshop on Assistive Computer Vision and Robotics
(ICCV-ACVR) 201
INSTRUMENTATION-BASED MUSIC SIMILARITY USING SPARSE REPRESENTATIONS
Iterative annotation to ease neural network training: Specialized machine learning in medical image analysis
Neural networks promise to bring robust, quantitative analysis to medical
fields, but adoption is limited by the technicalities of training these
networks. To address this translation gap between medical researchers and
neural networks in the field of pathology, we have created an intuitive
interface which utilizes the commonly used whole slide image (WSI) viewer,
Aperio ImageScope (Leica Biosystems Imaging, Inc.), for the annotation and
display of neural network predictions on WSIs. Leveraging this, we propose the
use of a human-in-the-loop strategy to reduce the burden of WSI annotation. We
track network performance improvements as a function of iteration and quantify
the use of this pipeline for the segmentation of renal histologic findings on
WSIs. More specifically, we present network performance when applied to
segmentation of renal micro compartments, and demonstrate multi-class
segmentation in human and mouse renal tissue slides. Finally, to show the
adaptability of this technique to other medical imaging fields, we demonstrate
its ability to iteratively segment human prostate glands from radiology imaging
data.
Comment: 15 pages, 7 figures, 2 supplemental figures (on the last page