    Image Mapping and Visual Attention on a Sensory Ego-Sphere

    System and method for image mapping and visual attention

    A method is described for mapping dense sensory data to a Sensory Ego-Sphere (SES). Methods are also described for finding and ranking areas of interest in the images that form a complete visual scene on an SES. Further, attentional processing of image data is best performed on the individual full-size images of the sequence, mapping each attentional location to the nearest SES node and then summing the attentional locations at each node.
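
    A minimal sketch of the node-summing step described above, assuming the SES is represented as a fixed set of unit direction vectors and that attentional locations have already been converted to view directions (the node layout, counts, and random data below are illustrative assumptions, not the authors' implementation):

        import numpy as np

        def nearest_node(direction, nodes):
            """Index of the SES node whose unit vector is closest to `direction`."""
            return int(np.argmax(nodes @ direction))  # max cosine similarity

        def accumulate_attention(attention_dirs, nodes):
            """Map each attentional location (a unit view direction) to its
            nearest node and sum the hits per node, as the abstract describes."""
            counts = np.zeros(len(nodes))
            for d in attention_dirs:
                counts[nearest_node(d, nodes)] += 1.0
            return counts

        # Illustrative use: random unit vectors stand in for the tessellated
        # sphere; a real SES uses a geodesic dome layout of nodes.
        rng = np.random.default_rng(0)
        nodes = rng.normal(size=(642, 3))
        nodes /= np.linalg.norm(nodes, axis=1, keepdims=True)
        salient = rng.normal(size=(50, 3))
        salient /= np.linalg.norm(salient, axis=1, keepdims=True)
        print(accumulate_attention(salient, nodes).max())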

    A moving observer in a three-dimensional world

    For many tasks, such as retrieving a previously viewed object, an observer must form a representation of the world at one location and use it at another. A world-based 3D reconstruction of the scene built up from visual information would fulfil this requirement, something computer vision now achieves with great speed and accuracy. However, I argue that it is neither easy nor necessary for the brain to do this. I discuss biologically plausible alternatives, including the possibility of avoiding 3D coordinate frames such as ego-centric and world-based representations. For example, the distance, slant and local shape of surfaces dictate the propensity of visual features to move in the image with respect to one another as the observer’s perspective changes (through movement or binocular viewing). Such propensities can be stored without the need for 3D reference frames. The problem of representing a stable scene in the face of continual head and eye movements is an appropriate starting place for understanding the goal of 3D vision, more so, I argue, than the case of a static binocular observer.
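
    A toy worked example of the "propensity to move" idea, assuming a pinhole camera with focal length f (in pixels) that translates sideways by T_x: a feature at depth Z shifts in the image by roughly f*T_x/Z, so the relative shift of two features encodes their relative inverse depth without any ego- or world-based frame. All numbers below are invented for illustration:

        # Hypothetical numbers: pinhole camera, small sideways step.
        f = 500.0    # focal length in pixels (assumed)
        T_x = 0.05   # camera translation in metres (assumed)

        def image_shift(Z):
            """Approximate horizontal image motion (pixels) of a point at
            depth Z metres under a small sideways translation T_x."""
            return f * T_x / Z

        near, far = 1.0, 4.0
        # The *relative* motion of the two features depends only on their
        # inverse-depth difference; no 3D coordinate frame is required.
        relative = image_shift(near) - image_shift(far)
        print(f"near moves {image_shift(near):.1f} px, far {image_shift(far):.1f} px, "
              f"relative {relative:.1f} px")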

    Perception and action without 3D coordinate frames

    Neuroscientists commonly assume that the brain generates representations of a scene in various non-retinotopic 3D coordinate frames, for example in 'egocentric' and 'allocentric' frames. Although neurons in early visual cortex might be described as representing a scene in an eye-centred frame, using two dimensions of visual direction and one of binocular disparity, there is no convincing evidence of similarly organized cortical areas using non-retinotopic 3D coordinate frames, nor of any systematic transfer of information from one frame to another. We propose that perception and action in a 3D world could be achieved without generating ego- or allocentric 3D coordinate frames. Instead, we suggest that the fundamental operation the brain carries out is to compare a long state vector with a matrix of weights (essentially, a long look-up table) to choose an output (often, but not necessarily, a motor output). The processes involved in perception of a 3D scene and action within it depend, we suggest, on successive iterations of this basic operation. Advantages of this proposal include that it relies on computationally well-defined operations corresponding to well-established neural processes. Also, we argue that from a philosophical perspective it is at least as plausible as theories postulating 3D coordinate frames. Finally, we suggest a variety of experiments that would falsify our claim.
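
    A minimal sketch of the proposed basic operation, reading the "long look-up table" as choosing the best-matching row of a weight matrix; the dimensions and random weights below are placeholder assumptions, not a model fitted to data:

        import numpy as np

        rng = np.random.default_rng(1)
        state_dim, n_outputs = 10_000, 64   # sizes are illustrative assumptions

        W = rng.normal(size=(n_outputs, state_dim))  # the "long look-up table"
        s = rng.normal(size=state_dim)               # current long state vector

        def choose_output(W, s):
            """Compare the state vector against every row of W and pick the
            best-matching output; iterating this is the proposed basic
            operation underlying perception and action."""
            return int(np.argmax(W @ s))

        print("selected output:", choose_output(W, s))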

    Felt_space infrastructure: Hypervigilant spatiality to valence the visceral dimension

    This thesis evolves perception as a hypothesis to reframe architectural praxis negotiated through agent-situation interaction. The research questions the geometric principles of architectural ordination to originate the ‘felt_space infrastructure’, a relational system of measurement concerned with the role of perception in mediating sensory space and the cognised environment. The methodological model for this research fuses perception and environmental stimuli into a consistent generative process that penetrates the inner essence of space to reveal the visceral parameter. These concepts are applied to develop a ‘coefficient of affordance’ typology, a ‘hypervigilant’ tool set, and a ‘cognitive_tope’ design methodology. Thus, by extending the architectural platform to consider perception as a design parameter, the thesis interprets the ‘inference schema’ as an instructional model to coordinate the acquisition of spatial reality through tensional and counter-tensional feedback dynamics. Three site-responsive case studies are used to advance the thesis. The first case study is descriptive and develops a typology of situated cognition to extend the ‘granularity’ of perceptual sensitisation (i.e. a fine-grained means of perceiving space). The second project is relational and questions how mapping can coordinate perceptual, cognitive and associative attention as a ‘multi-webbed vector field’ composed of attractors and deformations within a viewer-centred gravitational space. The third case study is causal, and demonstrates how a transactionally biased schema can generate, amplify and attenuate perceptual misalignment, thus triggering a visceral niche. The significance of the research is that it progresses generative perception as an additional variable for spatial practice, and promotes transactional methodologies that yield enhanced modes of spatial acuity, extending the repertoire of architectural practice.

    Vehicle Detection for RCTA/ANS (Autonomous Navigation System)

    Using a stereo camera pair, imagery is acquired and processed through the JPLV stereo processing pipeline. From this stereo data, large 3D blobs are found. These blobs are then described and classified by their shape to determine which are vehicles and which are not. Prior vehicle detection algorithms are either targeted to specific domains, such as following lead cars, or are intensity-based methods that involve learning typical vehicle appearances from a large corpus of training data. In order to detect vehicles, the JPL Vehicle Detection (JVD) algorithm goes through the following steps:
    1. Take as input a left disparity image and left rectified image from JPLV stereo.
    2. Project the disparity data onto a two-dimensional Cartesian map.
    3. Post-process the map built in the previous step in order to clean it up.
    4. Find peaks in the processed map and grow each peak out into a map blob. These map blobs represent large, roughly vehicle-sized objects in the scene.
    5. Reject the map blobs that do not meet certain criteria, build descriptors for the ones that remain, and pass the descriptors to a classifier, which determines whether each blob is a vehicle or not.
    The probability of detection is the probability that if a vehicle is present in the image, is visible, and is un-occluded, then it will be detected by the JVD algorithm. In order to estimate this probability, eight sequences from the RCTA (Robotics Collaborative Technology Alliance) program were ground-truthed, totaling over 4,000 frames with 15 unique vehicles. Since these vehicles were observed at varying ranges, one is able to find the probability of detection as a function of range. At the time of this reporting, the JVD algorithm was tuned to perform best on cars seen from the front, rear, or either side, and to perform poorly on vehicles seen from oblique angles.
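
    A compact sketch of steps 2 through 4 under stated assumptions: the camera parameters, grid resolution, and thresholds below are invented for illustration, and the flood-fill blob growing is a generic stand-in for JPLV's actual map processing, which the abstract does not detail:

        import numpy as np

        # Assumed camera parameters and grid resolution (illustrative only).
        FOCAL, BASELINE, CX = 600.0, 0.3, 320.0   # pixels, metres, pixels
        CELL = 0.25                               # map cell size in metres
        MAP_X, MAP_Z = 160, 160                   # map extent in cells

        def disparity_to_map(disp):
            """Step 2: project each valid disparity pixel into a top-down
            Cartesian map (X across, Z away from the camera)."""
            grid = np.zeros((MAP_Z, MAP_X))
            vs, us = np.nonzero(disp > 0)
            Z = FOCAL * BASELINE / disp[vs, us]   # depth from disparity
            X = (us - CX) * Z / FOCAL             # lateral offset
            ix = ((X / CELL) + MAP_X // 2).astype(int)
            iz = (Z / CELL).astype(int)
            ok = (ix >= 0) & (ix < MAP_X) & (iz >= 0) & (iz < MAP_Z)
            np.add.at(grid, (iz[ok], ix[ok]), 1.0)
            return grid

        def grow_blobs(grid, peak_thresh=20.0, member_thresh=5.0):
            """Steps 3-4: keep well-supported cells, then grow each peak
            into a connected map blob by flood fill (4-connectivity)."""
            blobs, seen = [], np.zeros_like(grid, dtype=bool)
            for iz, ix in zip(*np.nonzero(grid >= peak_thresh)):
                if seen[iz, ix]:
                    continue
                stack, blob = [(iz, ix)], []
                seen[iz, ix] = True
                while stack:
                    z, x = stack.pop()
                    blob.append((z, x))
                    for dz, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nz, nx = z + dz, x + dx
                        if (0 <= nz < MAP_Z and 0 <= nx < MAP_X
                                and not seen[nz, nx]
                                and grid[nz, nx] >= member_thresh):
                            seen[nz, nx] = True
                            stack.append((nz, nx))
                blobs.append(blob)
            return blobs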

    Expanded skin

    Our digital interfaces have been degrading human sensory intelligence by limiting our body to only vision and the first two fingers. Despite the high level of available technology, we do not fully utilize it, partly from a lack of awareness of its applicability beyond media consumption, and partly because human–computer interaction (HCI) remains inaccessible beyond our sense of sight and touch screens. Those technologies have been key elements in all of my works, since my ultimate position is to redirect technology in a way that could enhance human sensory intelligence. I believe that the digital environment around us has already arrived at a point where current technologies create an enhanced sphere of human sensory experience. My practice is focused on how to restructure the invisible interaction system between humans and the digital medium and to expand our sensory experience through that interaction. I de-familiarize and re-frame the invisible interactions into clear inputs and outputs to raise autonomy in this relationship by connecting our physical body to a synthetic body as an extension of our own. Mostly, I translate the sense of self by observing and analyzing our body gestures and designing a framework for the intensification of a sense. This practice ultimately aims to design the extended skin ego, our actual skin’s sense of self, by re-purposing technology from a separate entity into an extension of our experiential being. In this book I will share framing anecdotes, specific scientific foundations such as octopus consciousness, and the experiments that were designed in the R&D process. The results from the experiments will be given as a proposal.

    Architecture for Multiple Interacting Robot Intelligences

    An architecture for robot intelligence enables a robot to learn new behaviors, create new behavior sequences autonomously, and interact with a dynamically changing environment. Sensory information is mapped onto a Sensory Ego-Sphere (SES) that rapidly identifies important changes in the environment and functions much like short-term memory. Behaviors are stored in a database associative memory (DBAM) that creates an active map from the robot's current state to a goal state and functions much like long-term memory. A dream state processes recent activities stored in the SES and creates or modifies behaviors in the DBAM.
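
    A data-structure sketch of the architecture as described, with Python containers standing in for the SES and DBAM; the event format, node count, and the consolidation rule in dream() are guesses at mechanisms the abstract does not spell out:

        import time
        from collections import deque

        class SensoryEgoSphere:
            """Short-term memory: recent sensory events keyed by direction
            node. The node count and event tuple format are assumptions."""
            def __init__(self, n_nodes=642, horizon=200):
                self.n_nodes = n_nodes
                self.events = deque(maxlen=horizon)  # (timestamp, node, data)

            def post(self, node, data):
                self.events.append((time.time(), node, data))

        class BehaviorDatabase:
            """Long-term memory: behaviors plus weighted links between them,
            from which an active map toward a goal state can be built."""
            def __init__(self):
                self.behaviors = {}   # name -> callable
                self.links = {}       # (name, name) -> weight

            def add(self, name, behavior):
                self.behaviors[name] = behavior

            def reinforce(self, a, b, amount=1.0):
                self.links[(a, b)] = self.links.get((a, b), 0.0) + amount

        def dream(ses, dbam):
            """Offline consolidation: replay recent SES activity and
            strengthen links between behaviors observed in succession
            (one plausible reading of the dream state)."""
            recent = [d for _, _, d in ses.events if d in dbam.behaviors]
            for a, b in zip(recent, recent[1:]):
                dbam.reinforce(a, b)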