Recognition of 3-D Objects from Multiple 2-D Views by a Self-Organizing Neural Architecture
The recognition of 3-D objects from sequences of their 2-D views is modeled by a neural architecture called VIEWNET, which uses View Information Encoded With NETworks. VIEWNET illustrates how several types of noise and variability in image data can be progressively removed while incomplete image features are restored and invariant features are discovered using an appropriately designed cascade of processing stages. VIEWNET first processes 2-D views of 3-D objects using the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and removes noise from the images. Boundary regularization and completion are achieved by the same mechanisms that suppress image noise. A log-polar transform is taken with respect to the centroid of the resulting figure and then re-centered to achieve 2-D scale and rotation invariance. The invariant images are coarse coded to further reduce noise, reduce foreshortening effects, and increase generalization. These compressed codes are input into a supervised learning system based on the fuzzy ARTMAP algorithm. Recognition categories of 2-D views are learned before evidence from sequences of 2-D view categories is accumulated to improve object recognition. Recognition is studied with noisy and clean images using slow and fast learning. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of 2-D views of jet aircraft with and without additive noise. A recognition rate of 90% is achieved with one 2-D view category and 98.5% with three 2-D view categories. National Science Foundation (IRI 90-24877); Office of Naval Research (N00014-91-J-1309, N00014-91-J-4100, N00014-92-J-0499); Air Force Office of Scientific Research (F9620-92-J-0499, 90-0083
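The log-polar step in the abstract above can be illustrated with a minimal sketch. It is not VIEWNET's implementation: the function names, grid sizes, and nearest-neighbor sampling are assumptions chosen for brevity. The point is that, once the grid is centered on the figure's centroid, a rotation of the figure becomes a circular shift along the angle axis and a change of scale becomes a shift along the log-radius axis.

```python
import numpy as np

def centroid(img):
    """Intensity-weighted centroid (row, col) of a 2-D array."""
    total = img.sum()
    rows, cols = np.indices(img.shape)
    return (rows * img).sum() / total, (cols * img).sum() / total

def log_polar(img, n_r=32, n_theta=64):
    """Resample img on a log-polar grid centered on its centroid.

    n_r and n_theta are illustrative grid sizes (assumptions).
    Sampling is nearest neighbor; points outside the image read as 0.
    """
    cy, cx = centroid(img)
    r_max = np.hypot(*img.shape) / 2.0
    # Log-spaced radii from 1 pixel to r_max, evenly spaced angles.
    radii = np.exp(np.linspace(0.0, np.log(r_max), n_r))
    thetas = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(radii, thetas, indexing="ij")
    ys = np.rint(cy + rr * np.sin(tt)).astype(int)
    xs = np.rint(cx + rr * np.cos(tt)).astype(int)
    inside = (ys >= 0) & (ys < img.shape[0]) & (xs >= 0) & (xs < img.shape[1])
    out = np.zeros((n_r, n_theta), dtype=float)
    out[inside] = img[ys[inside], xs[inside]]
    return out

# Toy figure: a filled disc of radius 10 centered in a 65x65 image.
img = np.zeros((65, 65))
yy, xx = np.indices(img.shape)
img[(yy - 32) ** 2 + (xx - 32) ** 2 <= 100] = 1.0
lp = log_polar(img)  # rows index log-radius, columns index angle
```

For the disc, every row of `lp` is constant across angle, which is the simplest case of the shift property: rotating the disc changes nothing.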
RON: Reverse Connection with Objectness Prior Networks for Object Detection
We present RON, an efficient and effective framework for generic object
detection. Our motivation is to smartly associate the best of the region-based
(e.g., Faster R-CNN) and region-free (e.g., SSD) methodologies. Under fully
convolutional architecture, RON mainly focuses on two fundamental problems: (a)
multi-scale object localization and (b) negative sample mining. To address (a),
we design the reverse connection, which enables the network to detect objects
on multi-levels of CNNs. To deal with (b), we propose the objectness prior to
significantly reduce the searching space of objects. We optimize the reverse
connection, objectness prior and object detector jointly by a multi-task loss
function, so that RON can directly predict final detection results from all
locations of various feature maps. Extensive experiments on the challenging
PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO benchmarks demonstrate the
competitive performance of RON. Specifically, with VGG-16 and low resolution
384×384 input size, the network gets 81.3% mAP on PASCAL VOC 2007 and 80.7% mAP on
PASCAL VOC 2012 datasets. Its superiority increases when datasets become larger
and more difficult, as demonstrated by the results on the MS COCO dataset. With
1.5G GPU memory at test phase, the speed of the network is 15 FPS, 3X faster
than the Faster R-CNN counterpart. Comment: Project page will be available at https://github.com/taokong/RON, and
formal paper will appear in CVPR 201
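The idea of using an objectness prior to shrink the search space can be sketched as follows. This is an illustrative simplification, not RON's code: the function name, array shapes, and threshold value are assumptions. Only locations whose objectness score clears the threshold are passed on for classification.

```python
import numpy as np

def filter_by_objectness(objectness, class_scores, threshold=0.03):
    """Keep only locations whose objectness exceeds a threshold.

    objectness:   shape (N,), one score in [0, 1] per candidate location.
    class_scores: shape (N, C), per-class scores for the same locations.
    threshold:    hypothetical cutoff, chosen here only for illustration.
    Returns (kept_indices, predicted_labels, predicted_scores).
    """
    keep = np.flatnonzero(objectness > threshold)
    kept = class_scores[keep]
    return keep, kept.argmax(axis=1), kept.max(axis=1)

# Four candidate locations, three classes (toy numbers).
objectness = np.array([0.90, 0.01, 0.50, 0.00])
class_scores = np.array([
    [0.1, 0.7, 0.2],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
])
indices, labels, scores = filter_by_objectness(objectness, class_scores)
```

In this toy input, two of the four locations fall below the threshold and are never scored by the classifier, which is the sense in which the prior reduces the search space.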
ARTSCENE: A Neural System for Natural Scene Classification
How do humans rapidly recognize a scene? How can neural models capture this biological competence to achieve state-of-the-art scene classification? The ARTSCENE neural system classifies natural scene photographs by using multiple spatial scales to efficiently accumulate evidence for gist and texture. ARTSCENE embodies a coarse-to-fine Texture Size Ranking Principle whereby spatial attention processes multiple scales of scenic information, ranging from global gist to local properties of textures. The model can incrementally learn and predict scene identity by gist information alone and can improve performance through selective attention to scenic textures of progressively smaller size. ARTSCENE discriminates 4 landscape scene categories (coast, forest, mountain and countryside) with up to 91.58% correct on a test set, outperforms alternative models in the literature which use biologically implausible computations, and outperforms component systems that use either gist or texture information alone. Model simulations also show that adjacent textures form higher-order features that are also informative for scene recognition. National Science Foundation (NSF SBE-0354378); Office of Naval Research (N00014-01-1-0624