    Towards Modular Spatio-temporal Perception for Task-adapting Robots

    In perception systems for object recognition, the advantages of using multiple modalities, combining approaches, and exploiting several views are emphasized, as they improve accuracy. However, implementations vary greatly, suggesting that there is no consensus yet on how to approach this problem. Nonetheless, we can identify common features of these methods and propose a flexible system in which existing and future approaches can be tested, compared and combined. We present a modular system to which perception routines can be easily added, and define the logic that makes them work together based on the lessons learned from different experiments.
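
    A registry-style design is one simple way to realize such a plug-in architecture. The sketch below is a minimal illustration under that assumption; the names (PerceptionRoutine, PerceptionSystem, the confidence-vote fusion) are hypothetical, not the paper's actual API.

```python
# Minimal sketch of one way such a modular pipeline could be organized.
# All names and the fusion rule are hypothetical illustrations.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Detection:
    label: str         # object category hypothesis
    confidence: float  # score in [0, 1]

# A perception routine maps raw sensor data to candidate detections.
PerceptionRoutine = Callable[[bytes], List[Detection]]

class PerceptionSystem:
    def __init__(self) -> None:
        self.routines: Dict[str, PerceptionRoutine] = {}

    def register(self, name: str, routine: PerceptionRoutine) -> None:
        """Plug in a new routine without touching existing ones."""
        self.routines[name] = routine

    def perceive(self, sensor_data: bytes) -> List[Detection]:
        """Run every registered routine and merge their outputs.
        The combination logic here is a naive confidence-weighted vote."""
        votes: Dict[str, float] = {}
        for routine in self.routines.values():
            for det in routine(sensor_data):
                votes[det.label] = votes.get(det.label, 0.0) + det.confidence
        return [Detection(label, score) for label, score in
                sorted(votes.items(), key=lambda kv: -kv[1])]
```

    A new routine, for instance a view- or modality-specific detector, would then be added with system.register("depth_view", routine) without modifying the combination logic.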

    The Laminar Architecture of Visual Cortex and Image Processing Technology

    The mammalian neocortex is organized into layers which include circuits that form functional columns in cortical maps. A major unsolved problem concerns how bottom-up, top-down, and horizontal interactions are organized within cortical layers to generate adaptive behaviors. This article summarizes a model, called the LAMINART model, of how these interactions help visual cortex to realize: (1) the binding process whereby cortex groups distributed data into coherent object representations; (2) the attentional process whereby cortex selectively processes important events; and (3) the developmental and learning processes whereby cortex stably grows and tunes its circuits to match environmental constraints. Such Laminar Computing completes perceptual groupings that realize the property of Analog Coherence, whereby winning groupings bind together their inducing features without losing their ability to represent analog values of these features. Laminar Computing also efficiently unifies the computational requirements of preattentive filtering and grouping with those of attentional selection. It hereby shows how Adaptive Resonance Theory (ART) principles may be realized within the laminar circuits of neocortex. Applications include boundary segmentation and surface filling-in algorithms for processing Synthetic Aperture Radar images. Defense Advanced Research Projects Agency and the Office of Naval Research (N00014-95-1-0409); Office of Naval Research (N00014-95-1-0657).
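
    The surface filling-in application mentioned above can be illustrated with a toy boundary-gated diffusion, in which a sparse brightness signal spreads between neighboring pixels except where a boundary signal blocks it. The sketch below is a minimal illustration of that general idea under simplifying assumptions (per-pixel gating, plain averaging), not the LAMINART equations; the function and array names are illustrative.

```python
import numpy as np

def fill_in(feature: np.ndarray, boundary: np.ndarray,
            iters: int = 200) -> np.ndarray:
    """Toy boundary-gated diffusion (illustrative, not the LAMINART model).

    feature : 2-D array of sparse brightness signals to be spread.
    boundary: 2-D array in [0, 1]; values near 1 block diffusion, so the
              filled-in surface stays compartmentalized at contours.
    Gating is applied per pixel for simplicity; a faithful model would
    gate each neighbor connection separately.
    """
    out = feature.astype(float).copy()
    gate = 1.0 - boundary  # permeability of each pixel to diffusion
    for _ in range(iters):
        padded = np.pad(out, 1, mode="edge")
        # Mean of the 4-neighborhood of every pixel.
        neighbors = 0.25 * (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                            padded[1:-1, :-2] + padded[1:-1, 2:])
        out = gate * neighbors + (1.0 - gate) * out
    return out
```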

    Intrusive effects of task-irrelevant information on visual selective attention: semantics and size

    Attentional selection is a mechanism by which incoming sensory information is prioritized for further, more detailed and effective processing. Given that attended information is privileged by the sensory system, understanding and predicting what information is granted prioritization becomes an important endeavor. It has been argued that salient events, as well as information related to the current goal of the organism (i.e., task-relevant information), receive such priority. Here, we propose that attentional prioritization is not limited to task-relevance, and discuss evidence showing that task-irrelevant, non-salient, high-level properties of unattended objects, namely object meaning and size, influence attentional allocation. Such intrusion of non-salient, task-irrelevant, high-level information points to the need to re-conceptualize and formally modify current models of attentional guidance.

    Learning from graphically integrated 2D and 3D representations improves retention of neuroanatomy.

    Visualizations in the form of computer-based learning environments are highly encouraged in science education, especially for teaching spatial material. Some spatial material, such as sectional neuroanatomy, is very challenging to learn: it involves learning the two-dimensional (2D) representations that are sampled from the three-dimensional (3D) object. In this study, a computer-based learning environment was used to explore the hypothesis that learning sectional neuroanatomy from a graphically integrated 2D and 3D representation leads to better learning outcomes than learning from a sequential presentation. The integrated representation explicitly demonstrates the 2D-3D transformation and should therefore support effective learning. The study was conducted using a computer graphical model of the human brain with two learning groups: Whole then Sections, and Integrated 2D3D. Both groups learned whole anatomy (3D neuroanatomy) before learning sectional anatomy (2D neuroanatomy). The Whole then Sections group then learned sectional anatomy using 2D representations only, while the Integrated 2D3D group learned sectional anatomy from a graphically integrated 3D and 2D model. A set of tests for generalization of knowledge to interpreting biomedical images was conducted immediately after learning was completed. The order of presentation of these tests was counterbalanced across participants to explore a secondary hypothesis of the study, preparation for future learning: if the computer-based instruction programs used in this study are effective tools for teaching anatomy, participants should continue learning neuroanatomy when exposed to new representations. A test of long-term retention of sectional anatomy was conducted 4-8 weeks after learning was completed. The Integrated 2D3D group was better than the Whole then Sections group at retaining knowledge of difficult instances of sectional anatomy after the retention interval. The benefit of learning from an integrated 2D3D representation suggests that some spatial transformations are better retained when they are learned through an explicit demonstration. Participants also showed evidence of continued learning on the tests of generalization with the help of cues and practice, even without feedback. This finding suggests that the computer-based learning programs used in this study were good tools for instruction of neuroanatomy.

    Object categorization using biological models

    Master's dissertation, Electrical and Electronic Engineering (Information and Telecommunication Technologies), Instituto Superior de Engenharia, Univ. do Algarve, 2013. Humans are naturals at categorizing objects, i.e., at dividing them into groups depending on their features and surroundings. We do it easily and in real time. Additionally, the Human Visual System (HVS) is the only system reliable for object detection, categorization and recognition; these processes take place in the visual cortex, with object recognition achieved around 150-200 ms and a categorization-specific activation also occurring in the prefrontal cortex before or around 100 ms. This is one piece of evidence substantiating that categorization is a more bottom-up process than recognition. Visual cortical area V1 is composed, among others, of simple and complex cells which are tuned to different spatial frequencies (scales), orientations and disparities. These cells' responses were used to build a model for event detection in V1; these events are classified by type (lines and edges) and polarity (positive and negative). The goal of this thesis being to develop a cortical model for object categorization, inspired by the HVS and based on 2D object views, the V1 multi-scale events generated by the former model were used to accomplish that goal. In the developed categorization model, the final category attributed to an object is the convergence of three similarity concepts, each defining in a different way the degree of resemblance between an object and a certain category; the resemblance degree is computed by comparing the V1 events of templates and objects. The resemblance degree, or similarity percentage, was calculated (a) in the first concept as the quotient between the number of events common to the object and the category templates (considering type and polarity) over all scales, and the number of the object's events over all scales; (b) in the second concept as the quotient between the number of common events (considering neither type nor polarity) over all scales, and the number of the object's events over all scales; (c) finally, in the third concept as the quotient between the number of common events (considering type and polarity) over all scales, and the number of the category's events over all scales. The final category assigned to an object is then (1st) a category on which the three concepts agree and (2nd) the best-scored one. For the proof of concept, a database composed of 8 different categories and 10 objects per category was used; left and right profile views were chosen to represent each object. Over the 80 results obtained by categorizing 40 objects in both views, an average categorization success rate of 93.75% was achieved: 92.50% for left profiles and 95.00% for right profiles; each of the miscategorized images was attributed a category similar to its true one. To conclude the proof of concept, the model was also tested for small invariances to rotation, scale and noise, again achieving high categorization success rates (above 82%).
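
    The three similarity quotients can be written compactly in code. The sketch below assumes events are represented as (scale, position, type, polarity) tuples and uses set intersection for the "common events" counts; that representation and the function names are illustrative assumptions, while the three quotients mirror concepts (a)-(c) above.

```python
# Illustrative sketch of the three similarity quotients described above.
from typing import Set, Tuple

Event = Tuple[int, int, int, str, str]  # (scale, x, y, type, polarity)

def strip_type_polarity(events: Set[Event]) -> Set[Tuple[int, int, int]]:
    """Keep only scale and position, ignoring event type and polarity."""
    return {(s, x, y) for (s, x, y, _t, _p) in events}

def similarities(obj: Set[Event], cat: Set[Event]) -> Tuple[float, float, float]:
    """Return the three resemblance degrees as fractions in [0, 1]."""
    if not obj or not cat:
        return 0.0, 0.0, 0.0
    common_full = len(obj & cat)                  # type and polarity match
    common_loose = len(strip_type_polarity(obj) &
                       strip_type_polarity(cat))  # position/scale only
    c1 = common_full / len(obj)    # concept (a): normalized by object events
    c2 = common_loose / len(obj)   # concept (b): type/polarity ignored
    c3 = common_full / len(cat)    # concept (c): normalized by category events
    return c1, c2, c3
```

    An object would then be assigned the best-scored category among those on which all three concepts agree, mirroring the two-step decision described above.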

    Holistic interpretation of visual data based on topology: semantic segmentation of architectural facades

    The work presented in this dissertation is a step towards effectively incorporating contextual knowledge into the task of semantic segmentation. To date, the use of context has been confined to the genre of the scene, with a few exceptions in the field; research has instead been directed towards enhancing appearance descriptors. While this is unarguably important, recent studies show that computer vision has reached near-human performance when relying on these descriptors, provided objects have stable, distinctive surface properties and imaging conditions are proper. When these conditions are not met, humans exploit their knowledge about the intrinsic geometric layout of the scene to make local decisions; computer vision lags behind when it comes to this asset. For this reason, we aim to bridge the gap by presenting algorithms for semantic segmentation of building facades that make use of topological aspects of the scene. We provide a classification scheme that carries out segmentation and recognition simultaneously. The algorithm solves a single optimization function and yields a semantic interpretation of facades, relying on the modeling power of probabilistic graphs and efficient discrete combinatorial optimization tools. We also tackle the problem of semantic facade segmentation with a neural network approach, attaining accuracy figures on par with the state of the art in a fully automated pipeline: starting from pixelwise classifications obtained via Convolutional Neural Networks (CNNs), these are structurally validated through a cascade of Restricted Boltzmann Machines (RBMs) and a Multi-Layer Perceptron (MLP) that regenerates the most likely layout. In the domain of architectural modeling, we address geometric multi-model fitting: we introduce a novel guided sampling algorithm based on Minimum Spanning Trees (MSTs), which surpasses other propagation techniques in terms of robustness to noise. We make a number of additional contributions, such as a measure of model deviation which captures variations among fitted models.
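
    The general idea behind MST-guided sampling can be sketched in a toy form: build a minimum spanning tree over the data points and grow minimal sample subsets along tree edges, on the assumption that tree-adjacent points are likely to belong to the same model. The sketch below, using scipy, illustrates only that general idea; the neighborhood construction and growth rule are assumptions, not the dissertation's algorithm.

```python
# Toy sketch of MST-guided hypothesis sampling for multi-model fitting.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

def mst_guided_samples(points: np.ndarray, n_samples: int,
                       subset_size: int = 2, rng=None) -> np.ndarray:
    """Draw minimal subsets biased towards MST-adjacent points.

    points: (N, D) array; subset_size: points per model hypothesis
    (e.g. 2 for a line). Returns an (n_samples, subset_size) index array.
    """
    rng = np.random.default_rng(rng)
    mst = minimum_spanning_tree(squareform(pdist(points))).tocoo()
    # Adjacency lists over MST edges: tree neighbors tend to share a model.
    neighbors = {i: [] for i in range(len(points))}
    for i, j in zip(mst.row, mst.col):
        neighbors[i].append(j)
        neighbors[j].append(i)
    samples = []
    for _ in range(n_samples):
        subset = [int(rng.integers(len(points)))]
        while len(subset) < subset_size:
            frontier = [n for p in subset for n in neighbors[p]
                        if n not in subset]
            # Grow along the tree when possible, fall back to random.
            nxt = rng.choice(frontier) if frontier else rng.integers(len(points))
            subset.append(int(nxt))
        samples.append(subset)
    return np.asarray(samples)
```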

    Visual Clutter Study for Pedestrian Using Large Scale Naturalistic Driving Data

    Some pedestrian crashes are due to the driver's late or difficult perception of a pedestrian's appearance. Recognizing pedestrians while driving is a complex cognitive activity. Visual clutter analysis can be used to study the factors that affect human visual search efficiency and to help design advanced driver assistance systems for better decision making and user experience. In this thesis, we propose a pedestrian perception evaluation model which can quantitatively analyze pedestrian perception difficulty using naturalistic driving data. An efficient detection framework was developed to locate pedestrians within large-scale naturalistic driving data. Visual clutter analysis was used to study the factors that may affect the driver's ability to perceive pedestrian appearance. Candidate factors were explored in a designed exploratory study using naturalistic driving data, and a bottom-up, image-based pedestrian clutter metric was proposed to quantify pedestrian perception difficulty in naturalistic driving data. Based on the proposed bottom-up clutter metric and a top-down estimator based on pedestrian appearance, a Bayesian probabilistic pedestrian perception evaluation model was then constructed to simulate the pedestrian perception process.
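
    As a loose illustration of how a bottom-up clutter metric and a top-down appearance estimate could be combined probabilistically, the sketch below uses edge density as a stand-in clutter score and a naive Bayes-style combination. Every function name, likelihood, and weight is an assumption for demonstration, not the thesis's actual metric or model.

```python
# Hedged sketch: edge-density clutter plus a top-down appearance prior,
# combined in a naive Bayesian fashion. All quantities are illustrative.
import numpy as np

def edge_density_clutter(gray: np.ndarray) -> float:
    """Fraction of pixels with strong local gradients (proxy for clutter)."""
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    return float((magnitude > magnitude.mean() + magnitude.std()).mean())

def perception_difficulty(gray_roi: np.ndarray,
                          appearance_salience: float) -> float:
    """P(hard to perceive | clutter, appearance), naive Bayes style.

    appearance_salience in [0, 1]: hypothetical top-down estimate of how
    much the pedestrian stands out (e.g. from a detector score).
    """
    clutter = edge_density_clutter(gray_roi)           # bottom-up evidence
    prior = 0.5                                        # uninformative prior
    like_hard = clutter * (1.0 - appearance_salience)  # assumed likelihoods
    like_easy = (1.0 - clutter) * appearance_salience
    evidence = prior * like_hard + (1.0 - prior) * like_easy
    return prior * like_hard / evidence if evidence else prior
```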

    A database for the digitization of the sedimentary architecture of fluvial systems: uses in pure and applied research

    A relational database has been devised as a tool for the digitization of features relating to the sedimentary and geomorphic architecture of modern rivers and ancient fluvial successions, as derived from either original field studies or published examples. The system has been designed in a way that permits the inclusion of hard and soft data (comprising geometries and spatial and hierarchical relationships) referring to classified genetic units belonging to three different hierarchical levels and assigned to stratigraphic volumes that are categorized in terms of deposystem boundary conditions and descriptive parameters. Several applications of the quantitative information generated through database interrogation have been explored, with the aim of demonstrating how a database methodology for the storage of sedimentary architecture data can be of use for both pure and applied sedimentary research. Firstly, an account is given of how the system can be employed for the creation of quantitative fluvial facies models, which summarize information on architectural styles associated with classes of depositional systems; the value of the approach is shown by contrasting its results with traditional qualitative models. Secondly, database output on large-scale fluvial architecture has been used in a comparative study investigating the role of basin-wide aggradation rates as predictors of fluvial architectural styles. The results contrast with what commonly considered stratigraphic models would predict; the main implication is the need to reconsider continental sequence stratigraphy models or their domain of applicability. This application further provides an example of how the methodology could be generalized to the study of the sensitivity of architecture to its controls. Thirdly, database output has been used to re-evaluate previously proposed approaches to guiding well-to-well correlation of subsurface fluvial channel bodies, as applied in earlier studies. Making use of the same analogue information, a new probabilistic approach is proposed as a way to inform or rank correlation panels of channel bodies across equally spaced wells. Finally, the value of the system as an instrument for constraining object- and pixel-based stochastic structure-imitating models of fluvial sedimentary architecture is demonstrated through a range of example applications employing database output.
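
    As a rough illustration of the kind of schema such a system implies (classified genetic units at three hierarchical levels, nested via parent links and assigned to categorized stratigraphic volumes), the sketch below uses Python's sqlite3; all table and column names are hypothetical, not the published design.

```python
# Hedged sketch of a possible relational schema; names are illustrative.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stratigraphic_volume (
    volume_id        INTEGER PRIMARY KEY,
    deposystem_class TEXT,   -- boundary-condition categorization
    aggradation_rate REAL    -- descriptive parameter (units assumed)
);
CREATE TABLE genetic_unit (
    unit_id     INTEGER PRIMARY KEY,
    volume_id   INTEGER REFERENCES stratigraphic_volume(volume_id),
    parent_id   INTEGER REFERENCES genetic_unit(unit_id),  -- hierarchy link
    level       INTEGER CHECK (level IN (1, 2, 3)),
    unit_type   TEXT,        -- e.g. 'channel body' (illustrative)
    width_m     REAL,        -- hard data: geometry
    thickness_m REAL
);
""")

# Example interrogation: geometry statistics per unit type, the kind of
# quantitative output a facies model could summarize.
rows = conn.execute("""
SELECT unit_type, AVG(width_m / thickness_m) AS mean_aspect_ratio
FROM genetic_unit
WHERE level = 2 AND width_m IS NOT NULL AND thickness_m IS NOT NULL
GROUP BY unit_type
""").fetchall()
```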