Search CORE

30,119 research outputs found

Ball-Scale Based Hierarchical Multi-Object Recognition in 3D Medical Images

Author: Bagci Ulas
Chen Xinjian
Udupa Jayaram K.
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 05/02/2010
Field of study

This paper investigates, using prior shape models and the concept of ball scale (b-scale), ways of automatically recognizing objects in 3D images without performing elaborate searches or optimization. That is, the goal is to place the model in a single shot close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. This is achieved via the following set of key ideas: (a) A semi-automatic way of constructing a multi-object shape model assembly. (b) A novel strategy of encoding, via b-scale, the pose relationship between objects in the training images and their intensity patterns captured in b-scale images. (c) A hierarchical mechanism of positioning the model, in a one-shot way, in a given image from a knowledge of the learnt pose relationship and the b-scale image of the given image to be segmented. The evaluation results on a set of 20 routine clinical abdominal female and male CT data sets indicate the following: (1) Incorporating a large number of objects improves the recognition accuracy dramatically. (2) The recognition algorithm can be thought as a hierarchical framework such that quick replacement of the model assembly is defined as coarse recognition and delineation itself is known as finest recognition. (3) Scale yields useful information about the relationship between the model assembly and any given image such that the recognition results in a placement of the model close to the actual pose without doing any elaborate searches or optimization. (4) Effective object recognition can make delineation most accurate.Comment: This paper was published and presented in SPIE Medical Imaging 201

arXiv.org e-Print Archive

Crossref

PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes

Author: Fox Dieter
Narayanan Venkatraman
Schmidt Tanner
Xiang Yu
Publication venue
Publication date: 26/05/2018
Field of study

Estimating the 6D pose of known objects is important for robots to interact with the real world. The problem is challenging due to the variety of objects as well as the complexity of a scene caused by clutter and occlusions between objects. In this work, we introduce PoseCNN, a new Convolutional Neural Network for 6D object pose estimation. PoseCNN estimates the 3D translation of an object by localizing its center in the image and predicting its distance from the camera. The 3D rotation of the object is estimated by regressing to a quaternion representation. We also introduce a novel loss function that enables PoseCNN to handle symmetric objects. In addition, we contribute a large scale video dataset for 6D object pose estimation named the YCB-Video dataset. Our dataset provides accurate 6D poses of 21 objects from the YCB dataset observed in 92 videos with 133,827 frames. We conduct extensive experiments on our YCB-Video dataset and the OccludedLINEMOD dataset to show that PoseCNN is highly robust to occlusions, can handle symmetric objects, and provide accurate pose estimation using only color images as input. When using depth data to further refine the poses, our approach achieves state-of-the-art results on the challenging OccludedLINEMOD dataset. Our code and dataset are available at https://rse-lab.cs.washington.edu/projects/posecnn/.Comment: Accepted to RSS 201

arXiv.org e-Print Archive

Crossref

Multi-view Convolutional Neural Networks for 3D Shape Recognition

Author: Kalogerakis Evangelos
Learned-Miller Erik
Maji Subhransu
Su Hang
Publication venue
Publication date: 27/09/2015
Field of study

A longstanding question in computer vision concerns the representation of 3D shapes for recognition: should 3D shapes be represented with descriptors operating on their native 3D formats, such as voxel grid or polygon mesh, or can they be effectively represented with view-based descriptors? We address this question in the context of learning to recognize 3D shapes from a collection of their rendered views on 2D images. We first present a standard CNN architecture trained to recognize the shapes' rendered views independently of each other, and show that a 3D shape can be recognized even from a single view at an accuracy far higher than using state-of-the-art 3D shape descriptors. Recognition rates further increase when multiple views of the shapes are provided. In addition, we present a novel CNN architecture that combines information from multiple views of a 3D shape into a single and compact shape descriptor offering even better recognition performance. The same architecture can be applied to accurately recognize human hand-drawn sketches of shapes. We conclude that a collection of 2D views can be highly informative for 3D shape recognition and is amenable to emerging CNN architectures and their derivatives.Comment: v1: Initial version. v2: An updated ModelNet40 training/test split is used; results with low-rank Mahalanobis metric learning are added. v3 (ICCV 2015): A second camera setup without the upright orientation assumption is added; some accuracy and mAP numbers are changed slightly because a small issue in mesh rendering related to specularities is fixe

arXiv.org e-Print Archive

CiteSeerX

Crossref

Combining depth and intensity images to produce enhanced object detection for use in a robotic colony

Author: A Milella
A Ushma
G Anusha
K Mikolajczyk
N Kanopoulos
R Abdulmajeed
S Adhikari
S Harnad
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/07/2017
Field of study

Robotic colonies that can communicate with each other and interact with their ambient environments can be utilized for a wide range of research and industrial applications. However amongst the problems that these colonies face is that of the isolating objects within an environment. Robotic colonies that can isolate objects within the environment can not only map that environment in de-tail, but interact with that ambient space. Many object recognition techniques ex-ist, however these are often complex and computationally expensive, leading to overly complex implementations. In this paper a simple model is proposed to isolate objects, these can then be recognize and tagged. The model will be using 2D and 3D perspectives of the perceptual data to produce a probability map of the outline of an object, therefore addressing the defects that exist with 2D and 3D image techniques. Some of the defects that will be addressed are; low level illumination and objects at similar depths. These issues may not be completely solved, however, the model provided will provide results confident enough for use in a robotic colony

Repository@Hull - Worktribe

Crossref