10 research outputs found
Recognition of 3-D Objects from Multiple 2-D Views by a Self-Organizing Neural Architecture
The recognition of 3-D objects from sequences of their 2-D views is modeled by a neural architecture, called VIEWNET that uses View Information Encoded With NETworks. VIEWNET illustrates how several types of noise and varialbility in image data can be progressively removed while incornplcte image features are restored and invariant features are discovered using an appropriately designed cascade of processing stages. VIEWNET first processes 2-D views of 3-D objects using the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and removes noise from the images. Boundary regularization and cornpletion are achieved by the same mechanisms that suppress image noise. A log-polar transform is taken with respect to the centroid of the resulting figure and then re-centered to achieve 2-D scale and rotation invariance. The invariant images are coarse coded to further reduce noise, reduce foreshortening effects, and increase generalization. These compressed codes are input into a supervised learning system based on the fuzzy ARTMAP algorithm. Recognition categories of 2-D views are learned before evidence from sequences of 2-D view categories is accumulated to improve object recognition. Recognition is studied with noisy and clean images using slow and fast learning. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of 2-D views of jet aircraft with and without additive noise. A recognition rate of 90% is achieved with one 2-D view category and of 98.5% correct with three 2-D view categories.National Science Foundation (IRI 90-24877); Office of Naval Research (N00014-91-J-1309, N00014-91-J-4100, N00014-92-J-0499); Air Force Office of Scientific Research (F9620-92-J-0499, 90-0083
Towards automated creation of image interpretation systems
Abstract. Automated image interpretation is an important task in numerous applications ranging from security systems to natural resource inventorization based on remote-sensing. Recently, a second generation of adaptive machine-learned image interpretation systems have shown expert-level performance in several challenging domains. While demonstrating an unprecedented improvement over hand-engineered and first generation machine-learned systems in terms of cross-domain portability, design-cycle time, and robustness, such systems are still severely limited. This paper inspects the anatomy of the state-of-the-art Multi resolution Adaptive Object Recognition framework (MR ADORE) and presents extensions that aim at removing the last vestiges of human intervention still present in the original design of ADORE. More specifically, feature selection is still a task performed by human domain experts and represents a major stumbling block in the creation process of fully autonomous image interpretation systems. This paper focuses on minimizing such need for human engineering. After discussing experimental results, showing the performance of the framework extensions in the domain of forestry, the paper concludes by outlining autonomous feature extraction methods that may completely remove the need for human expertise in the feature selection process
Recommended from our members
Driving saccade to pursuit using image motion
Within the context of active vision, scant attention has been paid to the execution of motion saccades—rapid re-adjustments of the direction of gaze to attend to moving objects. In this paper we first develop a methodology for, and give real-time demonstrations of, the use of motion detection and segmentation processes to initiate capture saccades towards a moving object. The saccade is driven by both position and velocity of the moving target under the assumption of constant target velocity, using prediction to overcome the delay introduced by visual processing. We next demonstrate the use of a first order approximation to the segmented motion field to compute bounds on the time-to-contact in the presence of looming motion. If the bound falls below a safe limit, a panic saccade is fired, moving the camera away from the approaching object. We then describe the use of image motion to realize smooth pursuit, tracking using velocity information alone, where the camera is moved so as to null a single constant image motion fitted within a central image region. Finally, we glue together capture saccades with smooth pursuit, thus effecting changes in both what is being attended to and how it is being attended to. To couple the different visual activities of waiting, saccading, pursuing and panicking, we use a finite state machine which provides inherent robustness outside of visual processing and provides a means of making repeated exploration. We demonstrate in repeated trials that the transition from saccadic motion to tracking is more likely to succeed using position and velocity control, than when using position alone