20,301 research outputs found

Not all visual symmetry is equal: partially distinct neural bases for vertical and horizontal symmetry

Author: Bona S.
Cattaneo Z.
Silvanto J.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

Visual mirror symmetry plays an important role in visual perception in both human and animal vision; its importance is reflected in the fact that it can be extracted automatically during early stages of visual processing. However, how this extraction is implemented at the cortical level remains an open question. Given the importance of symmetry in visual perception, one possibility is that there is a network which extracts all types of symmetry irrespective of axis of orientation; alternatively, symmetry along different axes might be encoded by different brain regions, implying that there is no single neural mechanism for symmetry processing. Here we used fMRI-guided transcranial magnetic stimulation (TMS) to compare the neural basis of the two main types of symmetry found in the natural world, vertical and horizontal symmetry. TMS was applied over either right Lateral Occipital Cortex (LO), right Occipital Face Area (OFA) or Vertex while participants were asked to detect symmetry in low-level dot configurations. Whereas detection of vertical symmetry was impaired by TMS over both LO and OFA, detection of horizontal symmetry was delayed by stimulation of LO only. Thus, different types of visual symmetry rely on partially distinct cortical networks.

WestminsterResearch

A Novel Optical/digital Processing System for Pattern Recognition

Author: Boone Bradley G.
Shukla Oodaye B.
Publication date: 01/02/1993

This paper describes two processing algorithms that can be implemented optically: the Radon transform and angular correlation. These two algorithms can be combined in one optical processor to extract all the basic geometric and amplitude features from objects embedded in video imagery. We show that the internal amplitude structure of objects is recovered by the Radon transform, which is a well-known result, but, in addition, we show simulation results that calculate angular correlation, a simple but unique algorithm that extracts object boundaries from suitably thresholded images from which length, width, area, aspect ratio, and orientation can be derived. In addition to circumventing scale and rotation distortions, these simulations indicate that the features derived from the angular correlation algorithm are relatively insensitive to tracking shifts and image noise. Some optical architecture concepts, including one based on micro-optical lenslet arrays, have been developed to implement these algorithms. Simulation test and evaluation using simple synthetic object data will be described, including results of a study that uses object boundaries (derivable from angular correlation) to classify simple objects using a neural network.
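
As a rough illustration of the Radon transform mentioned in this abstract, the sketch below computes one discrete projection of a small image by summing pixel values along parallel lines. It is a digital nearest-bin approximation written for this listing, not the optical implementation the paper describes; the function name `radon_projection` and the centering and binning choices are assumptions.

```python
import math

def radon_projection(image, theta_deg):
    """Approximate one Radon projection of a 2D image (list of rows).

    Each pixel is assigned to the bin nearest its signed distance from the
    image center, measured along the direction theta. This is a coarse
    nearest-bin scheme (an assumption), not an interpolating transform.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0          # image center
    theta = math.radians(theta_deg)
    c, s = math.cos(theta), math.sin(theta)
    half = int(math.ceil(math.hypot(w, h) / 2.0))  # largest possible offset
    bins = [0.0] * (2 * half + 1)
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value:
                offset = (x - cx) * c + (y - cy) * s
                bins[int(round(offset)) + half] += value
    return bins

# A vertical bar: at 0 degrees all its mass falls into the central bin,
# at 90 degrees it spreads over three neighbouring bins.
bar = [[0, 1, 0],
       [0, 1, 0],
       [0, 1, 0]]
profile_0 = radon_projection(bar, 0)
profile_90 = radon_projection(bar, 90)
```

Collecting such profiles over many angles gives a sinogram; the total mass of each profile equals the sum of the image, which is a quick sanity check on any implementation.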

NASA Technical Reports Server

An original framework for understanding human actions and body language by using deep neural networks

Author: MASSARONI CRISTIANO
Publication date: 28/02/2020

The evolution of both Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, in turn, plays a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: in the first one, a method based on Long Short Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures is proposed; the second module presents a solution based on 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition by using 3D skeleton and Deep Neural Networks (DNNs) is provided. The performance of LSTM-RNNs is explored in depth, due to their ability to model the long-term contextual information of temporal sequences, making them suitable for analysing body movements. All the modules were tested on challenging, well-known datasets, showing remarkable results compared to methods in the current literature.

Archivio della ricerca- Università di Roma La Sapienza

PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes

Author: Fox Dieter
Narayanan Venkatraman
Schmidt Tanner
Xiang Yu
Publication date: 26/05/2018

Estimating the 6D pose of known objects is important for robots to interact with the real world. The problem is challenging due to the variety of objects as well as the complexity of a scene caused by clutter and occlusions between objects. In this work, we introduce PoseCNN, a new Convolutional Neural Network for 6D object pose estimation. PoseCNN estimates the 3D translation of an object by localizing its center in the image and predicting its distance from the camera. The 3D rotation of the object is estimated by regressing to a quaternion representation. We also introduce a novel loss function that enables PoseCNN to handle symmetric objects. In addition, we contribute a large scale video dataset for 6D object pose estimation named the YCB-Video dataset. Our dataset provides accurate 6D poses of 21 objects from the YCB dataset observed in 92 videos with 133,827 frames. We conduct extensive experiments on our YCB-Video dataset and the OccludedLINEMOD dataset to show that PoseCNN is highly robust to occlusions, can handle symmetric objects, and provides accurate pose estimation using only color images as input. When using depth data to further refine the poses, our approach achieves state-of-the-art results on the challenging OccludedLINEMOD dataset. Our code and dataset are available at https://rse-lab.cs.washington.edu/projects/posecnn/. Comment: Accepted to RSS 201
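
The two geometric steps named in this abstract (recovering a 3D translation from a 2D object center plus a predicted distance, and representing rotation as a quaternion) can be sketched as follows. This is a minimal illustration under a standard pinhole camera model, not the PoseCNN code: the function names, the intrinsics `fx, fy, px, py`, and the `(w, x, y, z)` quaternion ordering are all assumptions made for the example.

```python
import math

def backproject_center(cx, cy, depth, fx, fy, px, py):
    """Recover a 3D translation from a 2D object center and its distance.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (px, py); (cx, cy) is the object center in pixels and depth is
    the predicted distance along the optical axis.
    """
    tx = (cx - px) * depth / fx
    ty = (cy - py) * depth / fy
    return (tx, ty, depth)

def quaternion_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix.

    The quaternion is normalized first, since a regressed quaternion is
    not guaranteed to have unit length.
    """
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]

# Hypothetical intrinsics for a 640x480 image.
translation = backproject_center(420, 240, 2.0, fx=500, fy=500, px=320, py=240)
rotation = quaternion_to_matrix((2.0, 0.0, 0.0, 0.0))  # scaled identity quaternion
```

Predicting the center and the distance separately, rather than regressing the translation directly, keeps the center estimate tied to image evidence even when the object is partly occluded.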

arXiv.org e-Print Archive

Crossref