    Level 2 perspective-taking distinguishes automatic and non-automatic belief-tracking

    Little is known about whether human beings’ automatic mindreading is computationally restricted to processing a limited kind of content, and what exactly the nature of that signature limit might be. We developed a novel object-detection paradigm to test adults’ automatic processing in a Level 1 perspective-taking (L1PT) context (where an agent’s belief, but not his visuospatial perspective, is relevantly different) and in a Level 2 perspective-taking (L2PT) context (where both the agent’s belief and visuospatial perspective are relevantly different). Experiment 1 uncovered that adults’ reaction times in the L1PT task were helpfully speeded by a bystander’s irrelevant belief when tracking two homogeneous objects but not in the L2PT task when tracking a single heterogeneous object. The limitation is especially striking given that the heterogeneous nature of the single object was fully revealed to participants as well as the bystander. The results were replicated in two further experiments, which confirmed that the selective modulation of adults’ reaction times was maintained when tracking the location of a single object (Experiment 2) and when attention checks were removed (Experiment 3). Our findings suggest that automatic mindreading draws upon a distinctively minimalist model of the mental that underspecifies representation of differences in perspective relative to an agent’s position in space.

    Allowing content-based functionalities in segmentation-based coding schemes

    This paper deals with the use of the segmentation tools and principles presented in [10] and [13] to enable content-based functionalities. In this framework, means for supervised selection of objects in the scene are proposed. In addition, a technique for object tracking in the context of segmentation-based video coding is presented. The technique is independent of the type of segmentation approach used in the coding scheme. The algorithm relies on a double partition of the image that yields spatially homogeneous regions. This double partition makes it possible to obtain the position and shape of the previous object in the current image while computing the projected partition. To demonstrate the potential of this algorithm, it is applied in a specific coding scheme so that content-based functionalities, such as selective coding, are enabled. Peer Reviewed. Postprint (published version)

    Self-Selective Correlation Ship Tracking Method for Smart Ocean System

    In recent years, with the development of the marine industry, the navigation environment has become more complicated. Artificial intelligence technologies such as computer vision can recognize, track, and count sailing ships to ensure maritime security and facilitate management in a Smart Ocean System. To address the scaling and boundary-effect problems of traditional correlation filtering methods, we propose a self-selective correlation filtering method based on box regression (BRCF). The proposed method mainly includes: 1) a self-selective model with a negative-sample mining method, which reduces the boundary effect while strengthening the classification ability of the classifier; 2) a bounding box regression method combined with a key-point matching method for scale prediction, leading to fast and efficient calculation. The experimental results show that the proposed method can effectively deal with ship size changes and background interference. The success rates and precisions were higher than those of Discriminative Scale Space Tracking (DSST) by over 8 percentage points on the marine traffic dataset of our laboratory. In terms of processing speed, the proposed method is faster than DSST by nearly 22 frames per second (FPS).

    Ambient Sound Provides Supervision for Visual Learning

    The sound of crashing waves, the roar of fast-moving cars -- sound conveys important information about the objects in our surroundings. In this work, we show that ambient sounds can be used as a supervisory signal for learning visual models. To demonstrate this, we train a convolutional neural network to predict a statistical summary of the sound associated with a video frame. We show that, through this process, the network learns a representation that conveys information about objects and scenes. We evaluate this representation on several recognition tasks, finding that its performance is comparable to that of other state-of-the-art unsupervised learning methods. Finally, we show through visualizations that the network learns units that are selective to objects that are often associated with characteristic sounds. Comment: ECCV 201

    Towards binocular active vision in a robot head system

    This paper presents the first results of an investigation and pilot study into an active, binocular vision system that combines binocular vergence, object recognition and attention control in a unified framework. The prototype developed is capable of identifying, targeting, verging on and recognizing objects in a highly cluttered scene without the need for calibration or other knowledge of the camera geometry. This is achieved by implementing all image analysis in a symbolic space without creating explicit pixel-space maps. The system structure is based on the ‘searchlight metaphor’ of biological systems. We present results of a first pilot investigation that yielded a maximum vergence error of 6.4 pixels, while seven of nine known objects were recognized in a highly cluttered environment. Finally, a “stepping stone” visual search strategy was demonstrated, taking a total of 40 saccades to find two known objects in the workspace, neither of which appeared simultaneously within the Field of View resulting from any individual saccade.