Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion of technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions
The Video Mesh: A Data Structure for Image-based Video Editing
This paper introduces the video mesh, a data structure for representing video as 2.5D "paper cutouts." The video mesh allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. The video mesh sparsely encodes optical flow as well as depth, and handles occlusion using local layering and alpha mattes. Motion is described by a sparse set of points tracked over time. Each point also stores a depth value. The video mesh is a triangulation over this point set, and per-pixel information is obtained by interpolation. The user rotoscopes occluding contours, and we introduce an algorithm to cut the video mesh along them. Object boundaries are refined with per-pixel alpha values. Because the video mesh is at its core a set of texture-mapped triangles, we leverage graphics hardware to enable interactive editing and rendering of a variety of effects. We demonstrate the effectiveness of our representation with a number of special effects including 3D viewpoint changes, object insertion, and depth-of-field manipulation
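The core of the abstract above — a triangulation over sparse tracked points, each carrying a depth value, with per-pixel values obtained by interpolation — can be sketched in a few lines. This is not the authors' implementation; the point positions and depths below are illustrative, and barycentric interpolation over a Delaunay triangulation stands in for the paper's mesh.

```python
import numpy as np
from scipy.spatial import Delaunay

# Sparse tracked points (one frame) and a depth value per point.
# Coordinates and depths are made up for illustration.
pts = np.array([[0, 0], [10, 0], [0, 10], [10, 10], [5, 5]], float)
depth = np.array([1.0, 2.0, 1.5, 2.5, 1.8])

tri = Delaunay(pts)  # the "video mesh": triangles over the point set

def depth_at(p):
    """Interpolate depth at pixel p by barycentric weights
    of the enclosing triangle; None if p is outside the mesh."""
    s = tri.find_simplex(p)
    if s < 0:
        return None
    T = tri.transform[s]                  # affine map to barycentric coords
    b = T[:2].dot(np.asarray(p, float) - T[2])
    bary = np.append(b, 1.0 - b.sum())
    return float(bary @ depth[tri.simplices[s]])
```

Per-pixel optical flow would be recovered the same way, by interpolating per-vertex motion instead of depth.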
Multi-camera Realtime 3D Tracking of Multiple Flying Animals
Automated tracking of animal movement allows analyses that would not
otherwise be possible by providing great quantities of data. The additional
capability of tracking in realtime - with minimal latency - opens up the
experimental possibility of manipulating sensory feedback, thus allowing
detailed explorations of the neural basis for control of behavior. Here we
describe a new system capable of tracking the position and body orientation of
animals such as flies and birds. The system operates with less than 40 msec
latency and can track multiple animals simultaneously. To achieve these
results, a multi target tracking algorithm was developed based on the Extended
Kalman Filter and the Nearest Neighbor Standard Filter data association
algorithm. In one implementation, an eleven-camera system is capable of
tracking three flies simultaneously at 60 frames per second using a gigabit
network of nine standard Intel Pentium 4 and Core 2 Duo computers. This
manuscript presents the rationale and details of the algorithms employed and
shows three implementations of the system. An experiment was performed using
the tracking system to measure the effect of visual contrast on the flight
speed of Drosophila melanogaster. At low contrasts, speed is more variable and
faster on average than at high contrasts. Thus, the system is already a useful
tool to study the neurobiology and behavior of freely flying animals. If
combined with other techniques, such as 'virtual reality'-type computer
graphics or genetic manipulation, the tracking system would offer a powerful
new way to investigate the biology of flying animals.
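The filtering loop the abstract describes — a Kalman filter per animal plus nearest-neighbour data association — can be sketched as below. The real system uses an Extended Kalman Filter because the camera projection model is nonlinear; this sketch keeps a linear constant-velocity model for brevity, and the noise covariances are assumed values, not the paper's.

```python
import numpy as np

dt = 1 / 60.0                                     # 60 fps, as in the text
F = np.block([[np.eye(3), dt * np.eye(3)],
              [np.zeros((3, 3)), np.eye(3)]])     # constant-velocity transition
H = np.hstack([np.eye(3), np.zeros((3, 3))])      # observe 3D position only
Q = 1e-3 * np.eye(6)                              # process noise (assumed)
R = 1e-2 * np.eye(3)                              # measurement noise (assumed)

def predict(x, P):
    """Propagate state mean and covariance one frame forward."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Fuse a 3D position measurement z into the track."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P

def nearest_neighbour(x_pred, detections):
    """Nearest-neighbour association: index of the detection
    closest to the track's predicted position."""
    d = np.linalg.norm(detections - (H @ x_pred), axis=1)
    return int(np.argmin(d))
```

In a multi-animal loop, each track is predicted, assigned its nearest detection, and updated; gating (rejecting associations beyond a distance threshold) would be added to handle missed detections.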
3D sparse feature model using short baseline stereo and multiple view registration
This paper outlines a methodology for generating a distinctive object representation offline, using short-baseline stereo fundamentals to triangulate highly descriptive object features in multiple pairs of stereo images. A group of sparse 2.5D perspective views is built, and the multiple views are then fused into a single sparse 3D model using a common 3D shape registration technique. Having prior knowledge, such as the proposed sparse feature model, is useful when detecting an object and estimating its pose for real-time systems like augmented reality
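The triangulation step underlying the 2.5D views above is standard rectified-stereo geometry: a feature matched across the two images is back-projected from its disparity. A minimal sketch, with illustrative focal length, baseline, and principal point rather than values from the paper:

```python
import numpy as np

def triangulate(xl, yl, xr, f=700.0, b=0.12, cx=320.0, cy=240.0):
    """Back-project a feature matched at (xl, yl) in the left image and
    column xr in the right image of a rectified pair to a 3D point in
    the left-camera frame. f in pixels, baseline b in metres (assumed)."""
    disparity = xl - xr          # short baseline -> small disparities
    Z = f * b / disparity        # depth from disparity
    X = (xl - cx) * Z / f
    Y = (yl - cy) * Z / f
    return np.array([X, Y, Z])
```

Triangulating many matched descriptors per stereo pair yields one sparse 2.5D view; registering several such views into a common frame then gives the single sparse 3D model.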
Content-Preserving Warps for 3D Video Stabilization
We describe a technique that transforms a video from a hand-held video camera so that it appears as if it were taken with a directed camera motion. Our method adjusts the video to appear as if it were taken from nearby viewpoints, allowing 3D camera movements to be simulated. By aiming only for perceptual plausibility, rather than accurate reconstruction, we are able to develop algorithms that can effectively recreate dynamic scenes from a single source video. Our technique first recovers the original 3D camera motion and a sparse set of 3D, static scene points using an off-the-shelf structure-from-motion system. Then, a desired camera path is computed either automatically (e.g., by fitting a linear or quadratic path) or interactively. Finally, our technique performs a least-squares optimization that computes a spatially-varying warp from each input video frame into an output frame. The warp is computed to both follow the sparse displacements suggested by the recovered 3D structure, and avoid deforming the content in the video frame. Our experiments on stabilizing challenging videos of dynamic scenes demonstrate the effectiveness of our technique
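The final least-squares optimization the abstract describes balances two terms: a data term pulling mesh vertices toward the displacements implied by the recovered 3D points, and a deformation term keeping neighbouring vertices moving together. A toy sketch on a 1D grid (the paper works on a 2D mesh with a similarity-preserving term; the weights and targets here are illustrative):

```python
import numpy as np

n = 5                                # grid vertices
targets = {1: 0.8, 3: 0.2}           # sparse displacement constraints (assumed)
w_data, w_smooth = 1.0, 10.0         # illustrative term weights

rows, rhs = [], []
for i, d in targets.items():         # data term: v_i should match displacement d
    r = np.zeros(n); r[i] = w_data
    rows.append(r); rhs.append(w_data * d)
for i in range(n - 1):               # deformation term: v_{i+1} should equal v_i
    r = np.zeros(n); r[i] = -w_smooth; r[i + 1] = w_smooth
    rows.append(r); rhs.append(0.0)

# Stack both terms into one linear least-squares problem and solve.
v, *_ = np.linalg.lstsq(np.vstack(rows), np.array(rhs), rcond=None)
```

With a strong deformation weight, the solved warp smears the conflicting point displacements into a smooth field rather than following either exactly, which is the "perceptual plausibility over accurate reconstruction" trade-off the abstract emphasizes.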
Relating Multimodal Imagery Data in 3D
This research develops and improves the fundamental mathematical approaches and techniques required to relate imagery and imagery-derived multimodal products in 3D. Image registration, in a 2D sense, will always be limited by the 3D effects of viewing geometry on the target. Effects such as occlusion, parallax, shadowing, and terrain/building elevation can often be mitigated with even a modest amount of 3D target modeling. Additionally, the imaged scene may appear radically different based on the sensed modality of interest; this is evident from the differences in visible, infrared, polarimetric, and radar imagery of the same site. This thesis develops a 'model-centric' approach to relating multimodal imagery in a 3D environment. By correctly modeling a site of interest, both geometrically and physically, it is possible to remove or mitigate some of the most difficult challenges associated with multimodal image registration. In order to accomplish this feat, the mathematical framework necessary to relate imagery to geometric models is thoroughly examined. Since geometric models may need to be generated to apply this 'model-centric' approach, this research develops methods to derive 3D models from imagery and LIDAR data. Of critical note is the implementation of complementary techniques for relating multimodal imagery that utilize the geometric model in concert with physics-based modeling to simulate scene appearance under diverse imaging scenarios. Finally, the often-neglected final phase of mapping localized image registration results back to the world coordinate system model for final data archival is addressed. In short, once a target site is properly modeled, both geometrically and physically, it is possible to orient the 3D model to the same viewing perspective as a captured image to enable proper registration.
If done accurately, the synthetic model's physical appearance can simulate the imaged modality of interest while simultaneously removing the 3D ambiguity between the model and the captured image. Once registered, the captured image can then be archived as a texture map on the geometric site model. In this way, the 3D information that was lost when the image was acquired can be regained and properly related with other datasets for data fusion and analysis
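The geometric half of orienting a model to a captured image's viewpoint is a pinhole projection of model points through the camera's intrinsics and pose. A minimal sketch, with placeholder intrinsics and pose rather than any values from the thesis:

```python
import numpy as np

# Illustrative camera intrinsics (focal length 800 px, principal point
# at image centre) and pose [R|t]: model frame 5 m in front of camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])
Rt = np.hstack([np.eye(3), np.array([[0.0], [0.0], [5.0]])])

def project(points_world):
    """Project Nx3 model points to Nx2 pixel coordinates, so that the
    rendered model shares the captured image's viewing perspective."""
    P = K @ Rt                                        # 3x4 projection matrix
    h = np.hstack([points_world, np.ones((len(points_world), 1))])
    uv = (P @ h.T).T
    return uv[:, :2] / uv[:, 2:3]                     # perspective divide
```

With the model rendered from this perspective (and a physics-based appearance model supplying the modality-specific radiometry), registration reduces to a 2D alignment whose result can be mapped back onto the 3D model as a texture.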