EchoFusion: Tracking and Reconstruction of Objects in 4D Freehand Ultrasound Imaging without External Trackers
Ultrasound (US) is the most widely used fetal imaging technique. However, US
images have a limited capture range and suffer from view-dependent artefacts
such as acoustic shadows. Compounding of overlapping 3D US acquisitions into a
high-resolution volume can extend the field of view and remove image artefacts,
which is useful for retrospective analysis including population-based studies.
However, such volume reconstructions require information about relative
transformations between probe positions from which the individual volumes were
acquired. In prenatal US scans, the fetus can move independently from the
mother, making external trackers such as electromagnetic or optical tracking
unable to track the motion between probe position and the moving fetus. We
provide a novel methodology for image-based tracking and volume reconstruction
by combining recent advances in deep learning and simultaneous localisation and
mapping (SLAM). Tracking semantics are established through the use of a
Residual 3D U-Net and the output is fed to the SLAM algorithm. As a proof of
concept, experiments are conducted on US volumes taken from a whole body fetal
phantom, and from the heads of real fetuses. For the fetal head segmentation,
we also introduce a novel weak annotation approach to minimise the required
manual effort for ground truth annotation. We evaluate our method
qualitatively, and quantitatively with respect to tissue discrimination
accuracy and tracking robustness.
Comment: MICCAI Workshop on Perinatal, Preterm and Paediatric Image Analysis (PIPPI), 201
LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background-subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting non-linear
optimization problems per frame are solved with specially tailored
data-parallel Gauss-Newton solvers. In order to achieve real-time performance
of over 25 Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture. Our method yields accuracy comparable to
off-line performance capture techniques, while being orders of magnitude
faster.
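As a rough illustration of the Gauss-Newton iteration named in the abstract above, the following minimal sketch fits a single scalar parameter to a toy exponential model. It is not the authors' data-parallel GPU solver: the function name `gauss_newton_1d`, the toy model, and the data are all assumptions made for this example.

```python
import math

def gauss_newton_1d(residual, jacobian, theta, iters=20, tol=1e-12):
    """Minimise sum(r_i(theta)**2) over one scalar parameter (toy sketch)."""
    for _ in range(iters):
        r = residual(theta)              # residual vector at current estimate
        j = jacobian(theta)              # d r_i / d theta for each residual
        jtj = sum(ji * ji for ji in j)   # scalar J^T J
        if jtj == 0.0:
            break                        # no descent information available
        # Gauss-Newton update: solve (J^T J) * step = -J^T r
        step = -sum(ji * ri for ji, ri in zip(j, r)) / jtj
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical toy problem: recover b in y = exp(b * x) from noise-free samples.
xs = [0.0, 1.0, 2.0]
ys = [math.exp(0.5 * x) for x in xs]
residual = lambda b: [math.exp(b * x) - y for x, y in zip(xs, ys)]
jacobian = lambda b: [x * math.exp(b * x) for x in xs]

b_hat = gauss_newton_1d(residual, jacobian, theta=0.0)
print(b_hat)
```

In a multi-parameter setting the scalar division becomes a linear solve with the matrix J^T J, which is the step a data-parallel solver would accelerate.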
HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose a
robust method for tracking the face and torso of the source actor. We extensively
evaluate our approach and show significant improvements in enabling much
greater flexibility in creating realistic reenacted output videos.
Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at Siggraph'1
Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm
Functional endoscopic sinus surgery (FESS) is a surgical procedure used to
treat acute cases of sinusitis and other sinus diseases. FESS is fast becoming
the preferred choice of treatment due to its minimally invasive nature.
However, due to the limited field of view of the endoscope, surgeons rely on
navigation systems to guide them within the nasal cavity. State-of-the-art
navigation systems report registration errors of over 1 mm, which is large
compared to the size of the nasal airways. We present an anatomically
constrained video-CT registration algorithm that incorporates multiple video
features. Our algorithm is robust in the presence of outliers. We also test our
algorithm on simulated and in-vivo data, and test its accuracy against
degrading initializations.
Comment: 8 pages, 4 figures, MICCA
Automatic Image Registration in Infrared-Visible Videos using Polygon Vertices
In this paper, an automatic method is proposed to perform image registration
for pairs of visible and infrared video sequences with multiple targets. In
multimodal image analysis such as image fusion systems, color and IR sensors are
placed close to each other and capture the same scene simultaneously, but the
videos are not properly aligned by default because of different fields of view,
image capturing information, working principle and other camera specifications.
Because the scenes are usually not planar, alignment needs to be performed
continuously by extracting relevant common information. In this paper, we
approximate the shape of the targets by polygons and use affine transformation
for aligning the two video sequences. After background subtraction, keypoints
on the contour of the foreground blobs are detected using the DCE (Discrete Curve
Evolution) technique. These keypoints are then described by the local shape at
each point of the obtained polygon. The keypoints are matched based on the
convexity of the polygon's vertices and the Euclidean distance between them. Only good
matches for each local shape polygon in a frame are kept. To achieve a global
affine transformation that maximises the overlapping of infrared and visible
foreground pixels, the matched keypoints of each local shape polygon are stored
temporally in a buffer for a few frames. The transformation matrix is evaluated at
each frame using the temporal buffer and the best matrix is selected, based on
an overlapping ratio criterion. Our experimental results demonstrate that this
method can provide highly accurate registered images and that it outperforms a
previous related method.
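The last two steps of the abstract above (estimating an affine transformation from matched keypoints, then keeping the candidate matrix with the best overlap) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not define its overlapping ratio, so intersection over union of foreground pixels is assumed here, and every function name is hypothetical.

```python
def solve3(m, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate (e.g. collinear) point configuration")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / a[i][i]
    return x

def fit_affine(src, dst):
    """Least-squares 2x3 affine matrix mapping src keypoints onto dst keypoints."""
    m = [[0.0] * 3 for _ in range(3)]
    bu, bv = [0.0] * 3, [0.0] * 3
    for (x, y), (u, v) in zip(src, dst):
        p = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                m[i][j] += p[i] * p[j]   # accumulate normal equations
            bu[i] += p[i] * u
            bv[i] += p[i] * v
    return [solve3(m, bu), solve3(m, bv)]

def apply_affine(t, pt):
    x, y = pt
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])

def overlap_ratio(t, ir_pixels, vis_pixels):
    """Assumed criterion: IoU of warped IR foreground and visible foreground."""
    warped = {tuple(round(c) for c in apply_affine(t, p)) for p in ir_pixels}
    vis = set(vis_pixels)
    union = warped | vis
    return len(warped & vis) / len(union) if union else 0.0

def best_transform(candidates, ir_pixels, vis_pixels):
    """Pick the candidate matrix with the highest overlap ratio."""
    return max(candidates, key=lambda t: overlap_ratio(t, ir_pixels, vis_pixels))

# Hypothetical matched keypoints: a scale of 2 plus a shift of (2, 3).
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(2, 3), (4, 3), (2, 5), (4, 5)]
t_fit = fit_affine(src, dst)
```

In the buffered setting the abstract describes, `candidates` would hold one matrix per buffered set of matches, and `best_transform` would be re-run at each frame.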