156 research outputs found

DeepContext: Context-Encoding Neural Pathways for 3D Holistic Scene Understanding

Author: Bai Mingru
Izadi Shahram
Kohli Pushmeet
Xiao Jianxiong
Zhang Yinda
Publication date: 01/01/2017

While deep neural networks have led to human-level performance on computer vision tasks, they have yet to demonstrate similar gains for holistic scene understanding. In particular, 3D context has been shown to be an extremely important cue for scene understanding, yet very little research has been done on integrating context information with deep models. This paper presents an approach to embed 3D context into the topology of a neural network trained to perform holistic scene understanding. Given a depth image depicting a 3D scene, our network aligns the observed scene with a predefined 3D scene template, and then reasons about the existence and location of each object within the scene template. In doing so, our model recognizes multiple objects in a single forward pass of a 3D convolutional neural network, capturing both global scene and local object information simultaneously. To create training data for this 3D network, we generate partly hallucinated depth images, which are rendered by replacing real objects with models of the same object category drawn from a repository of CAD models. Extensive experiments demonstrate the effectiveness of our algorithm compared to the state of the art. Source code and data are available at http://deepcontext.cs.princeton.edu. Comment: Accepted by ICCV201

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

BRANCHING NEURAL NETWORKS

Author: Biçici Ufuk Can
Izadi Shahram
Keskin Cem
Publication venue: Technical Disclosure Commons
Publication date: 13/06/2018

A conditional deep learning model that learns specialized representations on a decision tree is described. Unlike similar methods that take a probabilistic mixture of experts (MoE) approach, a feature-augmentation-based method is used to jointly train all network and decision parameters using backpropagation, which allows for deterministic binary decisions at both training and test time, specializing subtrees exclusively to clusters of data. Feature augmentation involves combining intermediate representations with scores or confidences assigned to branches. Each representation is augmented with all of the scores assigned to the active branch on the computational path to encode the entire path information, which is essential for efficient training of decision functions. These networks are referred to as Branching Neural Networks (BNNs). As this approach is orthogonal to many other neural network compression methods, such algorithms can be combined to achieve much higher compression rates and further speedups.

Technical Disclosure Commons
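
The routing idea in the Branching Neural Networks abstract above (hard binary decisions, with every score on the active path appended to the features) can be illustrated with a minimal sketch. This is not the authors' implementation: the dictionary-based tree, the `tanh` score, the function name `branch_forward`, and all weights are hypothetical choices made for illustration, and training is omitted entirely.

```python
import math


def branch_forward(x, node):
    """Route a feature vector down a binary tree using hard decisions.

    Assumption: each internal node is a dict with a "decision" weight vector
    and "left"/"right" children; each leaf is a dict with a "leaf" label.
    """
    path_scores = []
    while "leaf" not in node:
        # Feature augmentation: the input is extended with every score
        # produced so far on the active path.
        features = x + path_scores
        score = math.tanh(sum(w * f for w, f in zip(node["decision"], features)))
        path_scores.append(score)
        # Deterministic binary decision: only one subtree is evaluated.
        node = node["right"] if score >= 0 else node["left"]
    return node["leaf"], path_scores


# Hypothetical depth-2 tree; decision vectors grow by one weight per level
# because one more path score is appended at each step.
tree = {
    "decision": [1.0, -1.0],
    "left": {
        "decision": [0.5, 0.5, 1.0],
        "left": {"leaf": "A"},
        "right": {"leaf": "B"},
    },
    "right": {
        "decision": [-1.0, 0.0, 1.0],
        "left": {"leaf": "C"},
        "right": {"leaf": "D"},
    },
}

label, scores = branch_forward([2.0, 0.5], tree)
```

Because each input follows a single path, only the nodes on that path are evaluated, which is the source of the speedup the abstract mentions. In this sketch the second-level decision sees the root score as an extra feature.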

Learning to Navigate the Energy Landscape

Author: Dai Angela
Izadi Shahram
Keskin Cem
Kohli Pushmeet
Nießner Matthias
Torr Philip
Valentin Julien
Publication date: 18/03/2016

In this paper, we present a novel and efficient architecture for addressing computer vision problems that use 'Analysis by Synthesis'. Analysis by synthesis involves the minimization of the reconstruction error, which is typically a non-convex function of the latent target variables. State-of-the-art methods adopt a hybrid scheme where discriminatively trained predictors like Random Forests or Convolutional Neural Networks are used to initialize local search algorithms. While these methods have been shown to produce promising results, they often get stuck in local optima. Our method goes beyond the conventional hybrid architecture by not only proposing multiple accurate initial solutions but also defining a navigational structure over the solution space that can be used for extremely efficient gradient-free local search. We demonstrate the efficacy of our approach on the challenging problem of RGB Camera Relocalization. To make the RGB camera relocalization problem particularly challenging, we introduce a new dataset of 3D environments which are significantly larger than those found in other publicly available datasets. Our experiments reveal that the proposed method is able to achieve state-of-the-art camera relocalization results. We also demonstrate the generalizability of our approach on Hand Pose Estimation and Image Retrieval tasks.

arXiv.org e-Print Archive

Crossref

User Interface Device with Actuated Buttons

Author: Butler Alex
Hodges Steve
Hook Jonathan David
Izadi Shahram
Taylor Stuart
Villar Nicolas
Publication date: 07/06/2012

A user interface device with actuated buttons is described. In an embodiment, the user interface device comprises two or more buttons and the motion of the buttons is controlled by actuators under software control such that their motion is inter-related. The position or motion of the buttons may provide a user with feedback about the current state of a software program they are using or provide them with enhanced user input functionality. In another embodiment, the ability to move the buttons is used to reconfigure the user interface buttons and this may be performed dynamically, based on the current state of the software program, or may be performed dependent upon the software program being used. The user interface device may be a peripheral device, such as a mouse or keyboard, or may be integrated within a computing device such as a games device.

White Rose Research Online

Quick and dirty: streamlined 3D scanning in archaeology

Author: Bennett Peter
Chrysanthi Angeliki
Earl Graeme
Fraser Mike
Izadi Shahram
Knibbe Jarrod
Marshall Mark
O'Hara Kenton
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2014

Capturing data is a key part of archaeological practice, whether for preserving records or to aid interpretation. But the technologies used are complex and expensive, resulting in time-consuming processes associated with their use. These processes force a separation between ongoing interpretive work and capture. Through two field studies we elicit more detail as to what is important about this interpretive work and what might be gained through a closer integration of capture technology with these practices. Drawing on these insights, we go on to present a novel, portable, wireless 3D modeling system that emphasizes "quick and dirty" capture. We discuss its design rationale in relation to our field observations and evaluate this rationale further by giving the system to archaeological experts to explore in a variety of settings. While our device compromises on the resolution of traditional 3D scanners, its support of interpretation through emphasis on real-time capture, review and manipulability suggests it could be a valuable tool for the future of archaeology.

Southampton (e-Prints Soton)

Crossref

Sheffield Hallam University Research Archive

King's Research Portal

Explore Bristol Research