Search CORE

82 research outputs found

Automatic discovery of image families: Global vs. local features

Author: Aly Mohamed
Munich Mario
Perona Pietro
Welinder Peter
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2009
Field of study

Gathering a large collection of images has been made quite easy by social and image sharing websites, e.g. flickr.com. However, using such collections faces the problem that they contain a large number of duplicates and highly similar images. This work tackles the problem of how to automatically organize image collections into sets of similar images, called image families hereinafter. We thoroughly compare the performance of two approaches to measure image similarity: global descriptors vs. a set of local descriptors. We assess the performance of these approaches as the problem scales up to thousands of images and hundreds of families. We present our results on a new dataset of CD/DVD game covers

Crossref

Caltech Authors

The Caltech-UCSD Birds-200-2011 Dataset

Author: Belongie Serge
Branson Steve
Perona Pietro
Wah Catherine
Welinder Peter
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/2011
Field of study

CUB-200-2011 is an extended version of CUB-200 [7], a challenging dataset of 200 bird species. The extended version roughly doubles the number of images per category and adds new part localization annotations. All images are annotated with bounding boxes, part locations, and at- tribute labels. Images and annotations were filtered by mul- tiple users of Mechanical Turk. We introduce benchmarks and baseline experiments for multi-class categorization and part localization

CiteSeerX

Caltech Authors

Asymmetric Actor Critic for Image-Based Robot Learning

Author: Abbeel Pieter
Andrychowicz Marcin
Pinto Lerrel
Welinder Peter
Zaremba Wojciech
Publication venue
Publication date: 17/10/2017
Field of study

Deep reinforcement learning (RL) has proven a powerful technique in many sequential decision making domains. However, Robotics poses many challenges for RL, most notably training on a physical system can be expensive and dangerous, which has sparked significant interest in learning control policies using a physics simulator. While several recent works have shown promising results in transferring policies trained in simulation to the real world, they often do not fully utilize the advantage of working with a simulator. In this work, we exploit the full state observability in the simulator to train better policies which take as input only partial observations (RGBD images). We do this by employing an actor-critic training algorithm in which the critic is trained on full states while the actor (or policy) gets rendered images as input. We show experimentally on a range of simulated tasks that using these asymmetric inputs significantly improves performance. Finally, we combine this method with domain randomization and show real robot experiments for several tasks like picking, pushing, and moving a block. We achieve this simulation to real world transfer without training on any real world data.Comment: Videos of experiments can be found at http://www.goo.gl/b57WT

arXiv.org e-Print Archive

Crossref

A Lazy Man’s Approach to Benchmarking: Semisupervised Classifier Evaluation and Recalibration

Author: Perona Pietro
Welinder Peter
Welling Max
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

How many labeled examples are needed to estimate a classifier’s performance on a new dataset? We study the case where data is plentiful, but labels are expensive. We show that by making a few reasonable assumptions on the structure of the data, it is possible to estimate performance curves, with confidence bounds, using a small number of ground truth labels. Our approach, which we call Semisupervised Performance Evaluation (SPE), is based on a generative model for the classifier’s confidence scores. In addition to estimating the performance of classifiers on new datasets, SPE can be used to recalibrate a classifier by reestimating the class-conditional confidence distributions

International Migration, Integration and Social Cohesion online publications

Cascaded Pose Regression

Author: Dollár Piotr
Perona Pietro
Welinder Peter
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2010
Field of study

We present a fast and accurate algorithm for computing the 2D pose of objects in images called cascaded pose regression (CPR). CPR progressively refines a loosely specified initial guess, where each refinement is carried out by a different regressor. Each regressor performs simple image measurements that are dependent on the output of the previous regressors; the entire system is automatically learned from human annotated training examples. CPR is not restricted to rigid transformations: ‘pose’ is any parameterized variation of the object’s appearance such as the degrees of freedom of deformable and articulated objects. We compare CPR against both standard regression techniques and human performance (computed from redundant human annotations). Experiments on three diverse datasets (mice, faces, fish) suggest CPR is fast (2-3ms per pose estimate), accurate (approaching human performance), and easy to train from small amounts of labeled data

Crossref

Caltech Authors

Caltech-UCSD Birds 200

Author: Belongie Serge
Branson Steve
Mita Takeshi
Perona Pietro
Schroff Florian
Wah Catherine
Welinder Peter
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/2010
Field of study

Caltech-UCSD Birds 200 (CUB-200) is a challenging image dataset annotated with 200 bird species. It was created to enable the study of subordinate categorization, which is not possible with other popular datasets that focus on basic level categories (such as PASCAL VOC, Caltech-101, etc). The images were downloaded from the website Flickr and filtered by workers on Amazon Mechanical Turk. Each image is annotated with a bounding box, a rough bird segmentation, and a set of attribute labels

CiteSeerX

Caltech Authors

Crowdclustering

Author: Gomes Ryan
Krause Andreas
Perona Pietro
Welinder Peter
Publication venue: Neural Information Processing Systems
Publication date: 01/01/2012
Field of study

Is it possible to crowdsource categorization? Amongst the challenges: (a) each worker has only a partial view of data, (b) different workers may have different clustering criteria and may produce different numbers of categories, (c) the underlying category structure may be hierarchical. We propose a Bayesian model of how workers may approach clustering and show how one may infer clusters/categories, as well as worker parameters, using this model. Our experiments, carried out on large collections of images, suggest that Bayesian crowdclustering works well and may be superior to single-expert annotations

Caltech Authors