
    Recognition self-awareness for active object recognition on depth images

    We propose an active object recognition framework that introduces recognition self-awareness, an intermediate level of reasoning for deciding which views to cover during object exploration. This is built by first learning a multi-view deep 3D object classifier; subsequently, a 3D dense saliency volume is generated by fusing together single-view visualization maps, each obtained by computing the gradient map of the class label on a different image plane. The saliency volume indicates which object parts the classifier considers most important for deciding a class. Finally, the volume is injected into the observation model of a Partially Observable Markov Decision Process (POMDP). In practice, the robot decides which views to cover depending on the expected ability of the classifier to discriminate an object class by observing a specific part. For example, the robot will look for the engine to discriminate between a bicycle and a motorbike, since the classifier has found that part to be highly discriminative. Experiments are carried out on depth images with both simulated and real data, showing that our framework predicts the object class with higher accuracy and lower energy consumption than a set of alternatives.
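
    The abstract keeps the POMDP machinery abstract. The sketch below is a rough illustration of the idea rather than the authors' implementation: each candidate view is scored by the expected reduction in class entropy, with the per-view likelihoods assumed to come from integrating the saliency volume over the voxels visible from that viewpoint. All function and variable names here are assumptions.

        import numpy as np

        def expected_information_gain(belief, view_likelihoods):
            """Expected reduction in class entropy after observing from a view.

            belief: (C,) current class probabilities.
            view_likelihoods: (C,) assumed p(view hits a salient part | class),
                e.g. from integrating the saliency volume over visible voxels.
            """
            def entropy(p):
                p = np.clip(p, 1e-12, 1.0)
                return -np.sum(p * np.log(p))

            # Two outcomes: the view covers a discriminative part (d=1) or not (d=0).
            p_d1 = np.sum(belief * view_likelihoods)
            p_d0 = 1.0 - p_d1
            post_d1 = belief * view_likelihoods / max(p_d1, 1e-12)
            post_d0 = belief * (1.0 - view_likelihoods) / max(p_d0, 1e-12)
            expected_post = p_d1 * entropy(post_d1) + p_d0 * entropy(post_d0)
            return entropy(belief) - expected_post

        def next_best_view(belief, saliency_per_view):
            """Greedy one-step lookahead over candidate viewpoints."""
            gains = [expected_information_gain(belief, lik)
                     for lik in saliency_per_view]
            return int(np.argmax(gains))

    A greedy one-step lookahead like this is a common approximation to a full POMDP policy when the horizon is short; the paper's actual observation model and planner may differ.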

    Children, Humanoid Robots and Caregivers

    This paper presents developmental learning on a humanoid robot from human-robot interactions. In particular, we consider teaching humanoids as children during the child's Separation and Individuation developmental phase (Mahler, 1979). Cognitive development during this phase is characterized both by the child's dependence on her mother for learning while becoming aware of her own individuality, and by self-exploration of her physical surroundings. We propose a learning framework for a humanoid robot inspired by such cognitive development.

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This suggests two things: (i) that Gestalt grouping is not used as a strategy in these tasks, and (ii) that objects may be stored in and retrieved from a pre-attentional store during this task.
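
    To make the spatial manipulation concrete, here is a small illustrative sketch (not the authors' stimulus code) of the radial shift: each rectangle is displaced along the imaginary spoke joining it to central fixation by ±1 degree of visual angle. The eight positions on a 4-degree circle are an assumed arrangement for illustration.

        import math
        import random

        def radial_shift(x, y, shift_deg=1.0):
            """Shift a point (x, y), in degrees of visual angle relative to
            fixation at the origin, along its spoke by +/- shift_deg."""
            r = math.hypot(x, y)
            theta = math.atan2(y, x)
            r_new = r + random.choice([-shift_deg, shift_deg])
            return r_new * math.cos(theta), r_new * math.sin(theta)

        # Assumed layout: eight rectangles arranged on a circle around fixation.
        positions = [(4 * math.cos(2 * math.pi * i / 8),
                      4 * math.sin(2 * math.pi * i / 8)) for i in range(8)]
        shifted = [radial_shift(x, y) for x, y in positions]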

    Vision, Action, and Make-Perceive

    In this paper, I critically assess the enactive account of visual perception recently defended by Alva Noë (2004). I argue inter alia that the enactive account falsely identifies an object's apparent shape with its 2D perspectival shape; that it mistakenly assimilates visual shape perception and volumetric object recognition; and that it seriously misrepresents the constitutive role of bodily action in visual awareness. I argue further that noticing an object's perspectival shape involves a hybrid experience combining both perceptual and imaginative elements – an act of what I call 'make-perceive'.

    The multisensory body revealed through its cast shadows

    One key issue when conceiving the body as a multisensory object is how the cognitive system integrates visible instances of the self and other bodies with one's own somatosensory processing, to achieve self-recognition and body ownership. Recent research has strongly suggested that shadows cast by our own body have a special status for cognitive processing, directing attention to the body in a fast and highly specific manner. The aim of the present article is to review the most recent scientific contributions addressing how body shadows affect both sensory/perceptual and attentional processes. The review examines three main points: (1) body shadows as a special window to investigate the construction of multisensory body perception; (2) experimental paradigms and related findings; (3) open questions and future trajectories. The reviewed literature suggests that shadows cast by one's own body promote binding between personal and extrapersonal space and elicit automatic orienting of attention toward the body part casting the shadow. Future research should address whether the effects exerted by body shadows are similar to those observed when observers are exposed to other visual instances of their body. The results will further clarify the processes underlying the merging of vision and somatosensation when creating body representations.

    EDUCATION AS MYTHIC IMAGE

    Mythopoetry, the imagistic voice of the muses which manifests in myth and natural poetry, has been invoked as an impression of ideal curriculum with which to cherish intimate, vital experience (and to oppose its exile from educational life). In this statement, I intend to see through the pleasant surface of the label, mythopoetry, to see what image may lie just out of sight, beyond the "inspired writing" that mythopoetry implies. Beyond words themselves, meaning is found in sound and in expressive representation. "Music, when soft voices die, / Vibrates in the memory" (Shelley).

    Geometry meets semantics for semi-supervised monocular depth estimation

    Depth estimation from a single image represents a very exciting challenge in computer vision. While other image-based depth sensing techniques leverage the geometry between different viewpoints (e.g., stereo or structure from motion), the lack of these cues within a single image renders the monocular depth estimation task ill-posed. At inference time, state-of-the-art encoder-decoder architectures for monocular depth estimation rely on effective feature representations learned at training time. For unsupervised training of these models, geometry has been effectively exploited through suitable image warping losses computed from views acquired by a stereo rig or a moving camera. In this paper, we take a further step forward, showing that learning semantic information from images also effectively improves monocular depth estimation. In particular, by leveraging semantically labeled images together with unsupervised signals gained from geometry through an image warping loss, we propose a deep learning approach aimed at joint semantic segmentation and depth estimation. Our overall learning framework is semi-supervised, as we deploy ground-truth data only in the semantic domain. At training time, our network learns a common feature representation for both tasks, and a novel cross-task loss function is proposed. The experimental findings show how jointly tackling depth prediction and semantic segmentation improves depth estimation accuracy. In particular, on the KITTI dataset our network outperforms state-of-the-art methods for monocular depth estimation. Comment: 16 pages, accepted to ACCV 2018.
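
    As a minimal sketch of the two ingredients the abstract combines, the code below shows an unsupervised photometric loss that warps the right stereo image with the predicted disparity, plus a supervised cross-entropy loss on semantic labels. It assumes PyTorch, a rectified stereo pair, and hypothetical loss weights; it illustrates the general technique, not the paper's actual network or cross-task loss.

        import torch
        import torch.nn.functional as F

        def warp_right_to_left(right, disparity):
            """Sample the right image at x - d to reconstruct the left view.

            right: (B, 3, H, W) image; disparity: (B, 1, H, W) in pixels.
            """
            b, _, h, w = right.shape
            ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
            xs = xs.to(right).expand(b, h, w) - disparity.squeeze(1)
            ys = ys.to(right).expand(b, h, w)
            # Normalize sampling coordinates to [-1, 1] for grid_sample.
            grid = torch.stack((2 * xs / (w - 1) - 1, 2 * ys / (h - 1) - 1), dim=-1)
            return F.grid_sample(right, grid, align_corners=True)

        def semi_supervised_loss(left, right, disparity, sem_logits, sem_labels,
                                 w_photo=1.0, w_sem=0.1):
            # Unsupervised term: photometric error of the reconstructed left view.
            photometric = (left - warp_right_to_left(right, disparity)).abs().mean()
            # Supervised term: ground truth is used only in the semantic domain.
            semantic = F.cross_entropy(sem_logits, sem_labels, ignore_index=255)
            return w_photo * photometric + w_sem * semantic

    Using ground truth only in the semantic term is what makes the setup semi-supervised: depth is driven purely by the geometric reconstruction signal, while the shared representation benefits from the labeled semantics.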
