The iNaturalist Species Classification and Detection Dataset
Existing image classification datasets used in computer vision tend to have a
uniform distribution of images across object categories. In contrast, the
natural world is heavily imbalanced, as some species are more abundant and
easier to photograph than others. To encourage further progress in challenging
real world conditions we present the iNaturalist species classification and
detection dataset, consisting of 859,000 images from over 5,000 different
species of plants and animals. It features visually similar species, captured
in a wide variety of situations, from all over the world. Images were collected
with different camera types, have varying image quality, feature a large class
imbalance, and have been verified by multiple citizen scientists. We discuss
the collection of the dataset and present extensive baseline experiments using
state-of-the-art computer vision classification and detection models. Results
show that current non-ensemble-based methods achieve only 67% top-1
classification accuracy, illustrating the difficulty of the dataset.
Specifically, we observe poor results for classes with small numbers of
training examples, suggesting more attention is needed in low-shot learning.
Comment: CVPR 201
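The abstract above reports overall top-1 accuracy and notes that rare classes fare worse. The following minimal sketch shows how those two views can differ on an imbalanced label set. The function names `top1_accuracy` and `per_class_accuracy` are hypothetical and are not taken from the paper or its code; the data in the usage note is made up for illustration.

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of examples whose single predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """Top-1 accuracy computed separately for each true class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / count for label, count in totals.items()}
```

With eight examples of a common class all classified correctly and two examples of a rare class of which one is wrong, the overall score stays high while the rare class score is much lower, which is the kind of gap the abstract describes for classes with few training examples.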
Computing the Stereo Matching Cost with a Convolutional Neural Network
We present a method for extracting depth information from a rectified image
pair. We train a convolutional neural network to predict how well two image
patches match and use it to compute the stereo matching cost. The cost is
refined by cross-based cost aggregation and semiglobal matching, followed by a
left-right consistency check to eliminate errors in the occluded regions. Our
stereo method achieves an error rate of 2.61% on the KITTI stereo dataset and
is currently (August 2014) the top-performing method on this dataset.
Comment: Conference on Computer Vision and Pattern Recognition (CVPR), June 201
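The left-right consistency check mentioned in the abstract above can be sketched as follows. This is a common textbook formulation, not necessarily the authors' implementation: it assumes integer disparity maps stored as nested lists, a left pixel at column x matching the right pixel at column x - d, and a tolerance `tol` of one pixel. The function name `left_right_check` and the parameter `tol` are hypothetical.

```python
def left_right_check(disp_left, disp_right, tol=1):
    """Return a boolean mask marking left-image pixels whose disparity is
    confirmed by the right-image disparity map.

    A pixel (x, y) with left disparity d is kept when its match at column
    x - d lies inside the image and the right disparity there differs from
    d by at most `tol`. All other pixels are flagged False (likely occluded
    or mismatched).
    """
    height = len(disp_left)
    width = len(disp_left[0]) if height else 0
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            d = disp_left[y][x]
            xr = x - d  # column of the matching pixel in the right image
            if 0 <= xr < width:
                mask[y][x] = abs(d - disp_right[y][xr]) <= tol
    return mask
```

Pixels flagged False would typically be filled in by a later interpolation step. The abstract does not describe that step, so it is left out here.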
Biologically Inspired Approaches to Automated Feature Extraction and Target Recognition
Ongoing research at Boston University has produced computational models of biological vision and learning that embody a growing corpus of scientific data and predictions. Vision models perform long-range grouping and figure/ground segmentation, and memory models create attentionally controlled recognition codes that intrinsically combine bottom-up activation and top-down learned expectations. These two streams of research form the foundation of novel dynamically integrated systems for image understanding. Simulations using multispectral images illustrate road completion across occlusions in a cluttered scene and information fusion from incorrect labels that are simultaneously inconsistent and correct. The CNS Vision and Technology Labs (cns.bu.edu/visionlab and cns.bu.edu/techlab) are further integrating science and technology through analysis, testing, and development of cognitive and neural models for large-scale applications, complemented by software specification and code distribution.
Air Force Office of Scientific Research (F40620-01-1-0423); National Geographic-Intelligence Agency (NMA 201-001-1-2016); National Science Foundation (SBE-0354378; BCS-0235298); Office of Naval Research (N00014-01-1-0624); National Geospatial-Intelligence Agency and the National Society of Siegfried Martens (NMA 501-03-1-2030, DGE-0221680); Department of Homeland Security graduate fellowshi