3D ShapeNets: A Deep Representation for Volumetric Shapes

Author: Khosla Aditya
Song Shuran
Tang Xiaoou
Wu Zhirong
Xiao Jianxiong
Yu Fisher
Zhang Linguang
Publication date: 01/01/2015

3D shape is a crucial but heavily underutilized cue in today's computer vision systems, mostly due to the lack of a good generic shape representation. With the recent availability of inexpensive 2.5D depth sensors (e.g. Microsoft Kinect), it is becoming increasingly important to have a powerful 3D shape representation in the loop. Apart from category recognition, recovering full 3D shapes from view-based 2.5D depth maps is also a critical part of visual understanding. To this end, we propose to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network. Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representations automatically. It naturally supports joint object recognition and shape completion from 2.5D depth maps, and it enables active object recognition through view planning. To train our 3D deep learning model, we construct ModelNet, a large-scale 3D CAD model dataset. Extensive experiments show that our 3D deep representation yields significant performance improvements over the state of the art in a variety of tasks. Comment: to appear in CVPR 201
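
The abstract describes a shape as binary variables on a 3D voxel grid. As a rough illustration of that input representation only, the sketch below turns a set of surface points into a binary occupancy grid. It does not implement the Convolutional Deep Belief Network, which would learn a joint distribution over such grids. The function names, the unit-cube normalization, and the default resolution are assumptions made for this example and are not taken from the paper.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]
Grid = List[List[List[int]]]


def voxelize(points: Iterable[Point], resolution: int = 8) -> Grid:
    """Map points to a binary occupancy grid of size resolution^3.

    Assumption: the shape has already been scaled into the unit cube
    [0, 1]^3. Coordinates outside that range are clamped to the border.
    """
    grid = [[[0] * resolution for _ in range(resolution)]
            for _ in range(resolution)]
    for x, y, z in points:
        # Clamp so that a coordinate of exactly 1.0 lands in the last cell.
        i, j, k = (max(0, min(int(c * resolution), resolution - 1))
                   for c in (x, y, z))
        grid[i][j][k] = 1  # 1 = occupied voxel, 0 = empty voxel
    return grid


def occupied_count(grid: Grid) -> int:
    """Number of voxels whose binary variable is 1."""
    return sum(v for plane in grid for row in plane for v in row)


if __name__ == "__main__":
    sample = [(0.0, 0.0, 0.0), (0.05, 0.05, 0.05), (1.0, 1.0, 1.0)]
    g = voxelize(sample, resolution=4)
    print(occupied_count(g))
```

Each grid cell here is one binary variable. A generative model over these variables is what would make recognition and shape completion possible, and that part is outside the scope of this sketch.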

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

ZENODO

Reading as Active Sensing: A Computational Model of Gaze Planning in Word Recognition

Author: Dimitri Ognibene
Giovanni Pezzulo
Marcello Ferro
Vito Pirrelli
Publication venue: Frontiers Research Foundation
Publication date: 01/01/2010

We offer a computational model of gaze planning during reading that consists of two main components: a lexical representation network, acquiring lexical representations from input texts (a subset of the Italian CHILDES database), and a gaze planner, designed to recognize written words by mapping strings of characters onto lexical representations. The model implements an active sensing strategy that selects which characters of the input string are to be fixated, depending on the predictions dynamically made by the lexical representation network. We analyze the developmental trajectory of the system in performing the word recognition task as a function of both increasing lexical competence and correspondingly increasing lexical prediction ability. We conclude by discussing how our approach can be scaled up in the context of an active sensing strategy applied to a robotic setting.
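
The active sensing idea in this abstract, choosing which character to fixate next based on current predictions, can be illustrated with a toy example. The sketch below is not the authors' model: it replaces the lexical representation network with a flat word list under a uniform prior, and it assumes a greedy criterion that fixates the position expected to leave the least uncertainty about the word. All function names and the tiny lexicon are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def expected_entropy(candidates: List[str], pos: int) -> float:
    """Expected remaining uncertainty (in bits) after fixating `pos`,
    assuming every candidate word is equally likely."""
    counts = Counter(word[pos] for word in candidates)
    total = len(candidates)
    # Seeing character c leaves counts[c] equally likely candidates.
    return sum((n / total) * math.log2(n) for n in counts.values())


def next_fixation(candidates: List[str],
                  observed: Dict[int, str]) -> Optional[int]:
    """Pick the unseen position with the lowest expected entropy."""
    unseen = [p for p in range(len(candidates[0])) if p not in observed]
    if not unseen:
        return None
    return min(unseen, key=lambda p: expected_entropy(candidates, p))


def recognize(target: str,
              lexicon: List[str]) -> Tuple[Optional[str], List[int]]:
    """Fixate characters of `target` until one candidate remains.

    Returns the recognized word (or None) and the fixated positions.
    """
    # Simplifying assumption: word length is known before any fixation.
    candidates = [w for w in dict.fromkeys(lexicon) if len(w) == len(target)]
    observed: Dict[int, str] = {}
    fixations: List[int] = []
    while len(candidates) > 1:
        pos = next_fixation(candidates, observed)
        if pos is None:
            break
        observed[pos] = target[pos]
        fixations.append(pos)
        candidates = [w for w in candidates if w[pos] == target[pos]]
    return (candidates[0] if candidates else None), fixations


if __name__ == "__main__":
    toy_lexicon = ["cane", "case", "casa", "pane"]
    print(recognize("casa", toy_lexicon))
```

With this toy lexicon, the word "casa" is identified after two fixations rather than four, because the planner skips positions where all remaining candidates agree.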

Crossref

Directory of Open Access Journals

PubMed Central

PUblication MAnagement

Working Memory Networks for Learning Temporal Order, with Application to 3-D Visual Object Recognition

Author: Bradski Gary
Carpenter Gail A.
Grossberg Stephen
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/04/1991

Working memory neural networks are characterized which encode the invariant temporal order of sequential events. Inputs to the networks, called Sustained Temporal Order REcurrent (STORE) models, may be presented at widely differing speeds, durations, and interstimulus intervals. The STORE temporal order code is designed to enable all emergent groupings of sequential events to be stably learned and remembered in real time, even as new events perturb the system. Such a competence is needed in neural architectures which self-organize learned codes for variable-rate speech perception, sensory-motor planning, or 3-D visual object recognition. Using such a working memory, a self-organizing architecture for invariant 3-D visual object recognition is described. The new model is based on the model of Seibert and Waxman (1990a), which builds a 3-D representation of an object from a temporally ordered sequence of its 2-D aspect graphs. The new model, called an ARTSTORE model, consists of the following cascade of processing modules: Invariant Preprocessor --> ART 2 --> STORE Model --> ART 2 --> Outstar Network. Funding: Defense Advanced Research Projects Agency (90-0083); British Petroleum (89-A1-1204); National Science Foundation (IRI 90-00530, IRI 87-16960); Air Force Office of Scientific Research (90-128, 90-0175)

Boston University Institutional Repository (OpenBU)

Working memory networks for learning multiple groupings of temporally ordered events: applications to 3-D visual object recognition

Author: Bradski Gary
Carpenter Gail A.
Grossberg Stephen
Publication venue: Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems
Publication date: 01/02/1991

Working memory neural networks are characterized which encode the invariant temporal order of sequential events that may be presented at widely differing speeds, durations, and interstimulus intervals. This temporal order code is designed to enable all possible groupings of sequential events to be stably learned and remembered in real time, even as new events perturb the system. Such a competence is needed in neural architectures which self-organize learned codes for variable-rate speech perception, sensory-motor planning, or 3-D visual object recognition. Using such a working memory, a self-organizing architecture for invariant 3-D visual object recognition is described that is based on the model of Seibert and Waxman [1]. Funding: Air Force Office of Scientific Research (90-128, 90-0175); British Petroleum (89-A-1204); Defense Advanced Research Projects Agency (90-0083); National Science Foundation (IRI 90-00530, IRI 87-16960)

Boston University Institutional Repository (OpenBU)