Large-Scale Neural Systems for Vision and Cognition
Consideration of how people respond to the question "What is this?" has suggested new problem frontiers for pattern recognition and information fusion, as well as neural systems that embody the cognitive transformation of declarative information into relational knowledge. In contrast to traditional classification methods, which aim to find the single correct label for each exemplar ("This is a car"), the new approach discovers rules that embody coherent relationships among labels which would otherwise appear contradictory to a learning system ("This is a car, that is a vehicle, over there is a sedan"). This talk will describe how an individual who experiences exemplars in real time, with each exemplar trained on at most one category label, can autonomously discover a hierarchy of cognitive rules, thereby converting local information into global knowledge. Computational examples are based on the observation that sensors working at different times, locations, and spatial scales, and experts with different goals, languages, and situations, may produce apparently inconsistent image labels, which are reconciled by implicit underlying relationships that the network’s learning process discovers. The ARTMAP information fusion system can, moreover, integrate multiple separate knowledge hierarchies, by fusing independent domains into a unified structure. In the process, the system discovers cross-domain rules, inferring multilevel relationships among groups of output classes, without any supervised labeling of these relationships. In order to self-organize its expert system, the ARTMAP information fusion network features distributed code representations which exploit the model’s intrinsic capacity for one-to-many learning ("This is a car and a vehicle and a sedan") as well as many-to-one learning ("Each of those vehicles is a car").
Fusion system software, testbed datasets, and articles are available from http://cns.bu.edu/techlab.
Defense Advanced Research Projects Agency (Hewlett-Packard Company, DARPA HR0011-09-3-0001; HRL Laboratories LLC subcontract 801881-BS under prime contract HR0011-09-C-0011); Science of Learning Centers program of the National Science Foundation (SBE-0354378)
Computational role of eccentricity dependent cortical magnification
We develop a sampling extension of M-theory focused on invariance to scale
and translation. Quite surprisingly, the theory predicts an architecture of
early vision with increasing receptive field sizes and a high resolution fovea
-- in agreement with data about the cortical magnification factor, V1 and the
retina. From the slope of the inverse of the magnification factor, M-theory
predicts a cortical "fovea" in V1 in the order of by basic units at
each receptive field size -- corresponding to a foveola of size around
minutes of arc at the highest resolution, degrees at the lowest
resolution. It also predicts uniform scale invariance over a fixed range of
scales independently of eccentricity, while translation invariance should
depend linearly on spatial frequency. Bouma's law of crowding follows in the
theory as an effect of cortical area-by-cortical area pooling; the Bouma
constant is the value expected if the signature responsible for recognition in
the crowding experiments originates in V2. From a broader perspective, the
emerging picture suggests that visual recognition under natural conditions
takes place by composing information from a set of fixations, with each
fixation providing recognition from a space-scale image fragment -- that is, an
image patch represented at a set of increasing sizes and decreasing
resolutions.
Key-Pose Prediction in Cyclic Human Motion
In this paper we study the problem of estimating innercyclic time intervals
within repetitive motion sequences of top-class swimmers in a swimming channel.
Interval limits are given by temporal occurrences of key-poses, i.e.
distinctive postures of the body. A key-pose is defined by means of only one or
two specific features of the complete posture. It is often difficult to detect
such subtle features directly. We therefore propose the following method: Given
that we observe the swimmer from the side, we build a pictorial structure of
poselets to robustly identify random support poses within the regular motion of
a swimmer. We formulate a maximum likelihood model which predicts a key-pose
given the occurrences of multiple support poses within one stroke. The maximum
likelihood can be extended with prior knowledge about the temporal location of
a key-pose in order to improve the prediction recall. We experimentally show
that our models reliably and robustly detect key-poses with a high precision
and that their performance can be improved by extending the framework with
additional camera views.
Comment: Accepted at WACV 2015, 8 pages, 3 figures
A Contrast- and Luminance-Driven Multiscale Network Model of Brightness Perception
A neural network model of brightness perception is developed to account for a wide variety of data, including the classical phenomenon of Mach bands, low- and high-contrast missing fundamental, luminance staircases, and non-linear contrast effects associated with sinusoidal waveforms. The model builds upon previous work on filling-in models that produce brightness profiles through the interaction of boundary and feature signals. Boundary computations that are sensitive to luminance steps and to continuous luminance gradients are presented. A new interpretation of feature signals through the explicit representation of contrast-driven and luminance-driven information is provided and directly addresses the issue of brightness "anchoring." Computer simulations illustrate the model's competencies.
Air Force Office of Scientific Research (F49620-92-J-0334); Northeast Consortium for Engineering Education (NCEE-A303-21-93); Office of Naval Research (N00014-91-J-4100); German BMFT grant (413-5839-01 1N 101 C/1); CNPq and NUTES/UFRJ, Brazil