216 research outputs found
Multi-scale cortical keypoint representation for attention and object detection
Keypoints (junctions) provide important information for
focus-of-attention (FoA) and object categorization/recognition. In this
paper we analyze the multi-scale keypoint representation, obtained by
applying a linear and quasi-continuous scaling to an optimized model of
cortical end-stopped cells, in order to study its importance and possibilities
for developing a visual, cortical architecture. We show that keypoints,
especially those which are stable over larger scale intervals, can provide
a hierarchically structured saliency map for FoA and object recognition.
In addition, applying non-classical receptive field inhibition to
keypoint detection makes it possible to distinguish contour keypoints
from texture (surface) keypoints.
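
As an illustration of the idea that keypoints which persist over larger scale intervals are more salient, here is a minimal sketch. It is not the authors' model: the function name `saliency_from_keypoints`, the input layout (a dictionary mapping a scale value to a list of `(x, y)` positions) and the `tolerance` parameter are all assumptions made for this example, and saliency is reduced to a simple count of scales at which a fine-scale keypoint reappears.

```python
def saliency_from_keypoints(keypoints_by_scale, tolerance=1):
    """Toy saliency: for every keypoint at the finest scale, count the
    scales at which a keypoint exists within `tolerance` pixels.

    keypoints_by_scale: {scale: [(x, y), ...]} (hypothetical layout).
    Returns {(x, y): number_of_scales_with_a_nearby_keypoint}.
    """
    scales = sorted(keypoints_by_scale)
    finest = scales[0]  # assumption: the smallest scale value is the finest
    saliency = {}
    for (x, y) in keypoints_by_scale[finest]:
        count = 0
        for s in scales:
            # A keypoint is "stable" at scale s if some detection lies nearby.
            if any(abs(x - u) <= tolerance and abs(y - v) <= tolerance
                   for (u, v) in keypoints_by_scale[s]):
                count += 1
        saliency[(x, y)] = count
    return saliency


if __name__ == "__main__":
    demo = {1: [(10, 10), (30, 5)], 2: [(10, 11)], 4: [(11, 10)]}
    print(saliency_from_keypoints(demo))
```

In this toy version a keypoint found at every scale receives the highest count, which gives a crude ordering of candidate positions for FoA.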
Multi-scale keypoints in V1 and beyond: object segregation, scale selection, saliency maps and face detection
End-stopped cells in cortical area V1, which combine outputs of complex cells tuned to different orientations, serve to detect line and edge crossings, singularities and points with large curvature. These cells can be used to construct retinotopic keypoint maps at different spatial scales (level-of-detail). The importance of the multi-scale keypoint representation is studied in this paper. It is shown that this representation provides very important information for object recognition and face detection. Different grouping operators can be used for object segregation and automatic scale selection. Saliency maps for focus-of-attention can be constructed. Such maps can be employed for face detection by grouping facial landmarks at eyes, nose and mouth. Although a face detector can be based on processing within area V1, it is argued that such an operator must be embedded into dorsal and ventral data streams, to and from higher cortical areas, for obtaining translation-, rotation- and scale-invariant detection
Face normalization using multi-scale cortical keypoints
Empirical studies concerning face recognition suggest
that faces may be stored in memory by a few canonical representations.
Models of visual perception are based on image
representations in cortical area V1 and beyond, which
contain many cell layers for feature extraction. Simple,
complex and end-stopped cells tuned to different spatial frequencies
(scales) and/or orientations provide input for line,
edge and keypoint detection. This yields a rich, multi-scale
object representation that can be stored in memory in order
to identify objects. The multi-scale, keypoint-based saliency
maps for Focus-of-Attention can be explored to obtain face
detection and normalization, after which face recognition
can be achieved using the line/edge representation. In this
paper, we focus only on face normalization, showing that
multi-scale keypoints can be used to construct canonical
representations of faces in memory
Face recognition by cortical multi-scale line and edge representations
Empirical studies concerning face recognition suggest that
faces may be stored in memory by a few canonical representations. Models
of visual perception are based on image representations in cortical
area V1 and beyond, which contain many cell layers for feature extraction.
Simple, complex and end-stopped cells provide input for line, edge
and keypoint detection. Detected events provide a rich, multi-scale object
representation, and this representation can be stored in memory in
order to identify objects. In this paper, the above context is applied to
face recognition. The multi-scale line/edge representation is explored in
conjunction with keypoint-based saliency maps for Focus-of-Attention.
Recognition rates of up to 96% were achieved by combining frontal and
3/4 views, and recognition was quite robust against partial occlusions
Recognition of facial expressions by cortical multi-scale line and edge coding
Empirical studies concerning face recognition suggest that faces may be stored in memory by a few canonical representations. Models of visual perception are based on image representations in cortical area V1 and beyond, which contain many cell layers for feature extraction.
Simple, complex and end-stopped cells provide input for line, edge and keypoint detection. Detected events provide a rich, multi-scale object representation, and this representation can be stored in memory in
order to identify objects. In this paper, the above context is applied to face recognition. The multi-scale line/edge representation is explored in conjunction with keypoint-based saliency maps for Focus-of-Attention.
Recognition rates of up to 96% were achieved by combining frontal and 3/4 views, and recognition was quite robust against partial occlusions
Multi-scale keypoints in V1 and face detection
End-stopped cells in cortical area V1, which combine outputs
of complex cells tuned to different orientations, serve to detect line
and edge crossings (junctions) and points with a large curvature. In this
paper we study the importance of the multi-scale keypoint representation,
i.e. retinotopic keypoint maps which are tuned to different spatial
frequencies (scale or Level-of-Detail). We show that this representation
provides important information for Focus-of-Attention (FoA) and object
detection. In particular, we show that hierarchically-structured saliency
maps for FoA can be obtained, and that combinations over scales in
conjunction with spatial symmetries can lead to face detection through
grouping operators that deal with keypoints at the eyes, nose and mouth,
especially when non-classical receptive field inhibition is employed.
Although a face detector can be based on feedforward and feedback loops
within area V1, such an operator must be embedded into dorsal and
ventral data streams to and from higher areas for obtaining translation-,
rotation- and scale-invariant face (object) detection
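
The grouping of facial keypoints by spatial symmetry can be pictured with a small sketch. This is only an assumed toy operator, not the grouping operators of the paper: the function name `find_face_candidates`, the tolerance `tol` and the distance proportions are invented for illustration, and only the eyes and the mouth are used (the nose is ignored). Keypoints are `(x, y)` tuples in image coordinates, with y growing downwards.

```python
from itertools import combinations


def find_face_candidates(keypoints, tol=0.25):
    """Return (left_eye, right_eye, mouth) triples with a roughly
    symmetric layout. All proportions are illustrative assumptions."""
    candidates = []
    for a, b in combinations(keypoints, 2):
        left, right = sorted((a, b))  # order the pair by x coordinate
        eye_dist = right[0] - left[0]
        # The two "eyes" must be horizontally separated and roughly level.
        if eye_dist <= 0 or abs(left[1] - right[1]) > tol * eye_dist:
            continue
        mid_x = (left[0] + right[0]) / 2
        eye_y = (left[1] + right[1]) / 2
        for m in keypoints:
            if m in (left, right):
                continue
            # The "mouth" must sit near the symmetry axis, below the eyes.
            centred = abs(m[0] - mid_x) <= tol * eye_dist
            below = 0.5 * eye_dist <= m[1] - eye_y <= 1.5 * eye_dist
            if centred and below:
                candidates.append((left, right, m))
    return candidates


if __name__ == "__main__":
    points = [(10, 10), (30, 10), (20, 30), (50, 50)]
    print(find_face_candidates(points))
```

The sketch works at a single scale and for upright faces only, which is consistent with the abstract's point that invariant detection needs more than such a local operator.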
Multi-scale cortical keypoints for realtime hand tracking and gesture recognition
Human-robot interaction is an interdisciplinary
research area which aims at integrating human factors, cognitive
psychology and robot technology. The ultimate goal is
the development of social robots. These robots are expected to
work in human environments, and to understand behavior of
persons through gestures and body movements. In this paper
we present a biological and realtime framework for detecting
and tracking hands. This framework is based on keypoints
extracted from cortical V1 end-stopped cells. Detected keypoints
and the cells’ responses are used to classify the junction type.
By combining annotated keypoints in a hierarchical, multi-scale
tree structure, moving and deformable hands can be segregated,
their movements can be obtained, and they can be tracked over
time. By using hand templates with keypoints at only two scales,
a hand’s gestures can be recognized
A cortical framework for invariant object categorization and recognition
In this paper we present a new model for invariant object categorization and recognition. It is based on explicit
multi-scale features: lines, edges and keypoints are extracted from responses of simple, complex and end-stopped cells in cortical area V1, and keypoints are used to construct saliency maps for Focus-of-Attention.
The model is a functional but dichotomous one, because keypoints are employed to model the “where” data stream, with dynamic routing of features from V1 to higher areas to obtain translation, rotation and size
invariance, whereas lines and edges are employed in the “what” stream for object categorization and recognition. Furthermore, both the “where” and “what” pathways are dynamic in that information at coarse
scales is employed first, after which information at progressively finer scales is added in order to refine the processes, i.e., both the dynamic feature routing and the categorization level. The construction of group and object templates, which are thought to be available in the prefrontal cortex with “what” and “where” components in PF46d and PF46v, is also illustrated. The model was tested in the framework of an integrated and biologically plausible architecture
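
The coarse-to-fine strategy, in which coarse-scale information is used first and finer scales are added to refine the result, can be sketched as follows. This is an assumed toy matcher, not the model of the paper: the function name `coarse_to_fine_match`, the use of sets of feature labels, the Jaccard overlap score and the `keep_ratio` pruning rule are all choices made for this example.

```python
def coarse_to_fine_match(observed, templates, keep_ratio=0.5):
    """Narrow down candidate templates scale by scale, coarsest first.

    observed:  {scale: set_of_features}           (hypothetical layout)
    templates: {name: {scale: set_of_features}}   (hypothetical layout)
    Returns the surviving template names, best accumulated score first.
    """
    candidates = set(templates)
    scores = {name: 0.0 for name in templates}
    # Assumption: a larger scale value means a coarser representation.
    for scale in sorted(observed, reverse=True):
        for name in candidates:
            expected = templates[name].get(scale, set())
            union = observed[scale] | expected
            shared = observed[scale] & expected
            # Jaccard overlap between observed and expected features.
            scores[name] += len(shared) / len(union) if union else 0.0
        best = max(scores[name] for name in candidates)
        # Drop templates that lag too far behind before refining further.
        candidates = {n for n in candidates if scores[n] >= keep_ratio * best}
    return sorted(candidates, key=lambda n: -scores[n])


if __name__ == "__main__":
    seen = {8: {"round"}, 4: {"round", "stem"}, 2: {"round", "stem", "dimple"}}
    known = {
        "apple": {8: {"round"}, 4: {"round", "stem"},
                  2: {"round", "stem", "dimple"}},
        "orange": {8: {"round"}, 4: {"round"}, 2: {"round", "pores"}},
        "banana": {8: {"elongated"}, 4: {"elongated", "stem"},
                   2: {"elongated", "stem"}},
    }
    print(coarse_to_fine_match(seen, known))
```

Candidates that fail at the coarsest scale are discarded early, so the finer scales only have to separate the remaining, more similar templates.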
Artistic rendering of the visual cortex
In this paper we explain the processing in the
first layers of the visual cortex by simple, complex and end-stopped
cells, plus grouping cells for line, edge, keypoint and
saliency detection. Three visualisations are presented: (a) an
integrated scheme that shows activities of simple, complex
and end-stopped cells, (b) artistic combinations of selected
activity maps that give an impression of global image
structure and/or local detail, and (c) non-photorealistic rendering (NPR) on the basis of a
2D brightness model. The cortical image representations
offer many possibilities for non-photorealistic rendering
- …