Recognition self-awareness for active object recognition on depth images
We propose an active object recognition framework that introduces recognition self-awareness: an intermediate level of reasoning that decides which views to cover during object exploration. This is built by first learning a multi-view deep 3D object classifier; subsequently, a 3D dense saliency volume is generated by fusing single-view visualization maps, the latter obtained by computing the gradient map of the class label on different image planes. The saliency volume indicates which object parts the classifier considers most important for deciding a class. Finally, the volume is injected into the observation model of a Partially Observable Markov Decision Process (POMDP). In practice, the robot decides which views to cover depending on the expected ability of the classifier to discriminate an object class by observing a specific part. For example, the robot will look for the engine to discriminate between a bicycle and a motorbike, since the classifier has found that part to be highly discriminative. Experiments are carried out on depth images with both simulated and real data, showing that our framework predicts the object class with higher accuracy and lower energy consumption than a set of alternatives.
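The gradient-of-the-class-label idea in this abstract can be illustrated with a toy finite-difference saliency map. This is only a sketch: `class_score_fn` is a hypothetical callable standing in for the paper's deep multi-view classifier, and a real system would use automatic differentiation rather than perturbing each pixel.

```python
import numpy as np

def saliency_map(depth_image, class_score_fn, eps=1e-3):
    """Numerical gradient magnitude of a class score w.r.t. each pixel.

    `class_score_fn` is a hypothetical classifier callable mapping an
    image to a scalar score for the target class. Pixels whose
    perturbation changes the score most are marked as most salient.
    """
    base = class_score_fn(depth_image)
    sal = np.zeros(depth_image.shape, dtype=float)
    for idx in np.ndindex(depth_image.shape):
        bumped = depth_image.astype(float).copy()
        bumped[idx] += eps           # perturb one pixel
        sal[idx] = abs(class_score_fn(bumped) - base) / eps
    return sal
```

Per-view maps like this, back-projected and fused over viewpoints, would give a 3D saliency volume in the spirit of the paper's pipeline.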
Children, Humanoid Robots and Caregivers
This paper presents developmental learning on a humanoid robot from human-robot interactions. We consider in particular teaching humanoids as children during the child's Separation and Individuation developmental phase (Mahler, 1979). Cognitive development during this phase is characterized both by the child's dependence on her mother for learning while becoming aware of her own individuality, and by self-exploration of her physical surroundings. We propose a learning framework for a humanoid robot inspired by such cognitive development.
Change blindness: eradication of gestalt strategies
Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task.
Vision, Action, and Make-Perceive
In this paper, I critically assess the enactive account of visual perception recently defended by Alva Noë (2004). I argue inter alia that the enactive account falsely identifies an object's apparent shape with its 2D perspectival shape; that it mistakenly assimilates visual shape perception and volumetric object recognition; and that it seriously misrepresents the constitutive role of bodily action in visual awareness. I argue further that noticing an object's perspectival shape involves a hybrid experience combining both perceptual and imaginative elements: an act of what I call 'make-perceive'.
The multisensory body revealed through its cast shadows
One key issue when conceiving the body as a multisensory object is how the cognitive system integrates visible instances of the self and other bodies with one's own somatosensory processing, to achieve self-recognition and body ownership. Recent research has strongly suggested that shadows cast by our own body have a special status for cognitive processing, directing attention to the body in a fast and highly specific manner. The aim of the present article is to review the most recent scientific contributions addressing how body shadows affect both sensory/perceptual and attentional processes. The review examines three main points: (1) body shadows as a special window to investigate the construction of multisensory body perception; (2) experimental paradigms and related findings; (3) open questions and future trajectories. The reviewed literature suggests that shadows cast by one's own body promote binding between personal and extrapersonal space and elicit automatic orienting of attention toward the body part casting the shadow. Future research should address whether the effects exerted by body shadows are similar to those observed when observers are exposed to other visual instances of their body. The results will further clarify the processes underlying the merging of vision and somatosensation when creating body representations.
EDUCATION AS MYTHIC IMAGE
Mythopoetry, the imagistic voice of the muses which manifests in myth and natural poetry, has been invoked as an impression of ideal curriculum with which to cherish intimate, vital experience (and to oppose its exile from educational life). In this statement, I intend to see through the pleasant surface of the label, mythopoetry, to see what image may lie just out of sight, beyond the "inspired writing" that mythopoetry implies. Beyond words themselves, meaning is found in sound and in expressive representation. 'Music, when soft voices die, / Vibrates in the memory' (Shelley).
Geometry meets semantics for semi-supervised monocular depth estimation
Depth estimation from a single image represents a very exciting challenge in computer vision. While other image-based depth sensing techniques leverage the geometry between different viewpoints (e.g., stereo or structure from motion), the lack of these cues within a single image renders the monocular depth estimation task ill-posed. For inference, state-of-the-art encoder-decoder architectures for monocular depth estimation rely on effective feature representations learned at training time. For unsupervised training of these models, geometry has been effectively exploited by suitable image-warping losses computed from views acquired by a stereo rig or a moving camera. In this paper, we take a further step forward, showing that learning semantic information from images effectively improves monocular depth estimation as well. In particular, by leveraging semantically labeled images together with unsupervised signals gained from geometry through an image-warping loss, we propose a deep learning approach aimed at joint semantic segmentation and depth estimation. Our overall learning framework is semi-supervised, as we deploy ground-truth data only in the semantic domain. At training time, our network learns a common feature representation for both tasks, and a novel cross-task loss function is proposed. The experimental findings show how jointly tackling depth prediction and semantic segmentation improves depth estimation accuracy. In particular, on the KITTI dataset our network outperforms state-of-the-art methods for monocular depth estimation.
Comment: 16 pages, Accepted to ACCV 201
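The image-warping loss this abstract builds on can be illustrated with a minimal rectified-stereo sketch. All names here are illustrative, and the loop uses nearest-neighbour sampling for clarity; real training pipelines use differentiable bilinear sampling so the loss can be backpropagated to the disparity network.

```python
import numpy as np

def warp_left_from_right(right, disparity):
    """Synthesize the left view by sampling the right image at x - d.

    Assumes rectified stereo: each left pixel (y, x) corresponds to
    right pixel (y, x - disparity[y, x]). Out-of-bounds samples are 0.
    """
    h, w = right.shape
    warped = np.zeros_like(right)
    for y in range(h):
        for x in range(w):
            src = int(round(x - disparity[y, x]))
            if 0 <= src < w:
                warped[y, x] = right[y, src]
    return warped

def photometric_loss(left, right, disparity):
    """Mean absolute photometric error between left and warped right."""
    return float(np.abs(left - warp_left_from_right(right, disparity)).mean())
```

If the predicted disparity is correct, the warped right image matches the left image and the loss approaches zero, which is the unsupervised signal exploited at training time.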