Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data, both for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
MoFA: Model-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction
In this work we propose a novel model-based deep convolutional autoencoder
that addresses the highly challenging problem of reconstructing a 3D human face
from a single in-the-wild color image. To this end, we combine a convolutional
encoder network with an expert-designed generative model that serves as
decoder. The core innovation is our new differentiable parametric decoder that
encapsulates image formation analytically based on a generative model. Our
decoder takes as input a code vector with exactly defined semantic meaning that
encodes detailed face pose, shape, expression, skin reflectance and scene
illumination. Due to this new way of combining CNN-based with model-based face
reconstruction, the CNN-based encoder learns to extract semantically meaningful
parameters from a single monocular input image. For the first time, a CNN
encoder and an expert-designed generative model can be trained end-to-end in an
unsupervised manner, which renders training on very large (unlabeled) real
world data feasible. The obtained reconstructions compare favorably to current
state-of-the-art approaches in terms of quality and richness of representation.
Comment: International Conference on Computer Vision (ICCV) 2017 (Oral), 13 pages
Towards Full-Body Gesture Analysis and Recognition
With computers embedded in every walk of life, there is an increasing demand for intuitive devices for human-computer interaction. Because human beings use gestures as an important means of communication, devices based on gesture recognition systems will be effective for human interaction with computers. However, it is very important to keep such a system as non-intrusive as possible, to reduce the limitations of interactions. Designing such a non-intrusive, intuitive, camera-based real-time gesture recognition system has been an active area of research in the field of computer vision.
Gesture recognition invariably involves tracking body parts. There are many research works on tracking body parts such as the eyes, lips and face. However, there is relatively little work on full-body tracking. Full-body tracking is difficult because it is expensive to model the full body as either a 2D or a 3D model and to track its movements.
In this work, we propose a monocular gesture recognition system that focuses on recognizing a set of arm movements commonly used to direct traffic, guide aircraft landing and communicate over long distances. This is an attempt towards implementing gesture recognition systems that require full-body tracking, e.g. an automated recognition system for semaphore flag signaling.
We have implemented a robust full-body tracking system, which forms the backbone of our gesture analyzer. The tracker makes use of a two-dimensional link-joint (LJ) model, which represents the human body, for tracking. Currently, we track the movements of the arms in a video sequence; however, we have future plans to make the system real-time. We use distance transform techniques to track the movements by fitting the parameters of the LJ model in every frame of the captured video. The tracker's output is fed to a state machine which identifies the gestures made. We have implemented this system using four sub-systems, namely:
1. Background subtraction sub-system, using Gaussian models and median filters.
2. Full-body tracker, using L-J Model APIs.
3. Quantizer, which converts the tracker's output into defined alphabets.
4. Gesture analyzer, which reads the alphabets into the action performed.
Currently, our gesture vocabulary contains gestures involving arms moving up and down, which can be used for detecting semaphore flag signaling. We can also detect gestures such as clapping and waving of the arms.
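The last two sub-systems in this abstract (quantizer and gesture analyzer) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the tracker reports one angle per arm in degrees, measured from the downward vertical, and it uses a plain sequence matcher in place of the state machine. The names `quantize_angle`, `to_alphabet`, `recognize` and the vocabulary entry are hypothetical.

```python
def quantize_angle(angle_deg, step=45):
    """Map an arm angle in degrees to a symbol index in 0 .. 360/step - 1."""
    bins = 360 // step
    return int(round((angle_deg % 360) / step)) % bins


def to_alphabet(left_deg, right_deg, step=45):
    """Quantize one tracker frame (left and right arm angles) into a symbol."""
    return (quantize_angle(left_deg, step), quantize_angle(right_deg, step))


def collapse_repeats(symbols):
    """Drop consecutive duplicates so a held pose yields a single symbol."""
    out = []
    for s in symbols:
        if not out or out[-1] != s:
            out.append(s)
    return out


def recognize(frames, vocabulary, step=45):
    """Return names of vocabulary patterns found in the quantized sequence."""
    symbols = collapse_repeats(to_alphabet(l, r, step) for l, r in frames)
    found = []
    for name, pattern in vocabulary.items():
        n = len(pattern)
        windows = (tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1))
        if any(w == pattern for w in windows):
            found.append(name)
    return found


# Hypothetical vocabulary: symbol 0 is "arm down", symbol 4 is "arm up".
VOCABULARY = {
    "raise_and_lower_both_arms": ((0, 0), (4, 4), (0, 0)),
}

# Hypothetical tracker output: (left, right) arm angles per frame, with noise.
frames = [(2, -3), (5, 1), (170, 178), (181, 185), (3, 358)]
print(recognize(frames, VOCABULARY))
```

Quantizing before matching keeps small tracking jitter from changing the symbol stream, which is presumably why the pipeline separates the two steps.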
Epälambertilaiset pinnat ja niiden haasteet konenäössä (Non-Lambertian surfaces and their challenges in computer vision)
This thesis examines non-Lambertian surfaces in computer vision: the challenges they pose, the solutions proposed and the research conducted on them. The physical theory for understanding the phenomenon is built first, using the Lambertian reflectance model, which defines Lambertian surfaces as ideally diffuse surfaces whose luminance is isotropic and whose luminous intensity obeys Lambert's cosine law. Of these two assumptions, non-Lambertian surfaces violate at least the cosine law and are consequently specularly reflecting surfaces, whose perceived brightness depends on the viewpoint. Non-Lambertian surfaces therefore also violate brightness and colour constancy, which assume that the brightness and colour of the same real-world points stay constant across images. These assumptions are used, for example, in tracking and feature matching, and thus non-Lambertian surfaces complicate object reconstruction and navigation, among other tasks in the field of computer vision.
After formulating the theoretical foundation of the necessary physics and a more general reflectance model, the bi-directional reflectance distribution function, a comprehensive literature review of significant studies regarding non-Lambertian surfaces is conducted. The primary topics of the survey include photometric stereo and navigation systems, with other potential fields, such as fusion methods and illumination invariance, also considered. The goal of the survey is to formulate a detailed answer to which methods can be used to solve the challenges posed by non-Lambertian surfaces, what the strengths and weaknesses of these methods are, which datasets are used and what remains to be answered by further research. After the survey, a dataset is collected and presented, and another dataset to be published in an upcoming paper is outlined. A general discussion of the survey and the study follows, and conclusions are presented along with proposed future steps.
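The view dependence described in this abstract can be made concrete with a small sketch. It is not taken from the thesis. It uses the standard diffuse shading term (brightness proportional to the cosine of the angle between the surface normal and the light direction) for the Lambertian case, and the Phong specular term as an assumed stand-in for a non-Lambertian surface. The function names and the parameters `albedo`, `ks` and `shininess` are illustrative choices.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lambertian(normal, light, albedo=1.0):
    """Diffuse brightness: depends on the light angle only, not on the viewer."""
    n, l = normalize(normal), normalize(light)
    return albedo * max(0.0, dot(n, l))


def phong_specular(normal, light, view, ks=0.5, shininess=20):
    """Specular brightness (Phong term): depends on the viewing direction."""
    n, l, v = normalize(normal), normalize(light), normalize(view)
    # Mirror reflection of the light direction about the surface normal.
    r = tuple(2.0 * dot(n, l) * nc - lc for nc, lc in zip(n, l))
    return ks * max(0.0, dot(r, v)) ** shininess


normal = (0.0, 0.0, 1.0)
theta = math.radians(60)
light = (math.sin(theta), 0.0, math.cos(theta))   # light 60 degrees off the normal
view_a = (-math.sin(theta), 0.0, math.cos(theta))  # camera on the mirror direction
view_b = (0.0, 0.0, 1.0)                           # camera along the normal

diffuse = lambertian(normal, light)                # identical for both cameras
spec_a = phong_specular(normal, light, view_a)
spec_b = phong_specular(normal, light, view_b)
print(diffuse, spec_a, spec_b)
```

The diffuse term is the same for both cameras, while the specular term differs between them. That difference is what breaks the brightness constancy assumed by tracking and feature matching.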