LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background-subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting per-frame
non-linear optimization problems are solved with specially tailored
data-parallel Gauss-Newton solvers. To achieve real-time performance of over
25 Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture; it yields accuracy comparable to off-line
performance capture techniques while being orders of magnitude faster.
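As a rough illustration of the kind of per-frame non-linear least-squares solve described above, here is a minimal damped Gauss-Newton sketch in NumPy. The paper's actual solvers are specially tailored, data-parallel GPU implementations; the `residuals` and `jacobian` callables here are placeholders.

```python
# Minimal Gauss-Newton sketch for a non-linear least-squares energy
# E(x) = sum_i r_i(x)^2, the generic shape of the per-frame problems
# above. Illustrative only, not the paper's GPU solver.
import numpy as np

def gauss_newton(residuals, jacobian, x0, iters=10, damping=1e-4):
    """Minimize ||r(x)||^2 with a damped Gauss-Newton iteration."""
    x = x0.copy()
    for _ in range(iters):
        r = residuals(x)          # stacked residual vector
        J = jacobian(x)           # Jacobian, shape (len(r), len(x))
        H = J.T @ J + damping * np.eye(x.size)  # damped normal equations
        x -= np.linalg.solve(H, J.T @ r)        # Gauss-Newton update
    return x

# Toy usage: fit the parameters of an exponential to samples.
t = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(-3.0 * t)
res = lambda p: p[0] * np.exp(-p[1] * t) - y
jac = lambda p: np.stack([np.exp(-p[1] * t),
                          -p[0] * t * np.exp(-p[1] * t)], axis=1)
print(gauss_newton(res, jac, np.array([1.0, 1.0])))  # ~ [2.0, 3.0]
```

The small diagonal damping term keeps the normal equations well conditioned when the Jacobian is rank-deficient, a common choice in real-time tracking solvers.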
Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery
One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data, enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
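As one concrete instance of the passive techniques such a review covers, the sketch below recovers depth from a rectified stereo laparoscope with OpenCV's semi-global block matcher. The image paths, focal length, and baseline are made-up values, not from the paper.

```python
# Depth from a rectified stereo pair: disparity via SGBM, then
# Z = f * B / d. All numbers below are illustrative assumptions.
import cv2
import numpy as np

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical inputs
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64,
                                blockSize=7)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0
# SGBM returns fixed-point disparities scaled by 16.

f, baseline = 700.0, 0.004        # focal length [px], baseline [m]: assumed
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = f * baseline / disparity[valid]  # depth map in metres
```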
Towards retrieving force feedback in robotic-assisted surgery: a supervised neuro-recurrent-vision approach
Robotic-assisted minimally invasive surgeries have gained a lot of popularity over conventional procedures as they offer many benefits to both surgeons and patients. Nonetheless, they still suffer from some limitations that affect their outcome. One of them is the lack of force feedback, which restricts the surgeon's sense of touch and might reduce precision during a procedure. To overcome this limitation, we propose a novel force estimation approach that combines a vision-based solution with supervised learning to estimate the applied force and provide the surgeon with a suitable representation of it. The proposed solution starts with extracting the geometry of motion of the heart's surface by minimizing an energy functional to recover its 3D deformable structure. A deep network, based on an LSTM-RNN architecture, is then used to learn the relationship between the extracted visual-geometric information and the applied force, and to find an accurate mapping between the two. Our proposed force estimation solution avoids the drawbacks usually associated with force sensing devices, such as biocompatibility and integration issues. We evaluate our approach on phantom and realistic tissues, on which we report an average root-mean-square error of 0.02 N.
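A minimal sketch of this kind of recurrent regressor, assuming a PyTorch LSTM over per-frame visual-geometric feature vectors; the feature dimension, depth, and layer sizes are our assumptions, not the authors' architecture.

```python
# Toy LSTM regressor mapping a sequence of visual-geometric features
# to an applied-force estimate. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class ForceEstimator(nn.Module):
    def __init__(self, feat_dim=32, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, num_layers=2,
                            batch_first=True)
        self.head = nn.Linear(hidden, 1)   # scalar force magnitude [N]

    def forward(self, x):                  # x: (batch, time, feat_dim)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # predict from last time step

model = ForceEstimator()
features = torch.randn(8, 25, 32)          # 8 sequences of 25 frames
force = model(features)                    # (8, 1) estimated forces
loss = nn.functional.mse_loss(force, torch.zeros(8, 1))  # training loss
```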
Evaluation of trackers for Pan-Tilt-Zoom Scenarios
Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in
computer vision for many years. Compared to tracking with a still camera, the
images captured with a PTZ camera are highly dynamic in nature because the
camera can perform large motion resulting in quickly changing capture
conditions. Furthermore, tracking with a PTZ camera involves camera control to
position the camera on the target. For successful tracking and camera control,
the tracker must be fast enough, or must be able to accurately predict the
next position of the target. Standard benchmarks therefore do not allow the
quality of a tracker for the PTZ scenario to be properly assessed. In this work, we
use a virtual PTZ framework to evaluate different tracking algorithms and
compare their performances. We also extend the framework to add target position
prediction for the next frame, accounting for camera motion and processing
delays. By doing this, we can assess if predicting can make long-term tracking
more robust as it may help slower algorithms for keeping the target in the
field of view of the camera. Results confirm that both speed and robustness are
required for tracking under the PTZ scenario.Comment: 6 pages, 2 figures, International Conference on Pattern Recognition
and Artificial Intelligence 201
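A minimal sketch of the kind of target prediction step evaluated here: a constant-velocity extrapolation over the processing delay, compensated by the commanded camera motion. The variable names and image-space formulation are our assumptions.

```python
# Predict where the target will appear in the next frame, accounting
# for tracker latency and the image shift induced by the PTZ command.
import numpy as np

def predict_target(p_prev, p_curr, dt, delay, cam_shift):
    """p_prev/p_curr: target image positions at the last two frames,
    dt: frame interval [s], delay: tracking + control latency [s],
    cam_shift: expected image-space shift caused by the PTZ command."""
    velocity = (p_curr - p_prev) / dt          # constant-velocity estimate
    predicted = p_curr + velocity * (dt + delay)
    return predicted - cam_shift               # compensate camera motion

p_pred = predict_target(np.array([310.0, 242.0]),
                        np.array([318.0, 240.0]),
                        dt=0.04, delay=0.03,
                        cam_shift=np.array([12.0, -1.0]))
```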
Depth estimation of optically transparent laser-driven microrobots
Six degree-of-freedom (DoF) pose feedback is essential for the development of closed-loop control techniques for microrobotics. This paper presents two methods for depth estimation of transparent microrobots inside an Optical Tweezers (OT) setup using image sharpness measurements and model-based tracking. The x-y position and the 3D orientation of the object are estimated using online model-based template matching. The proposed depth estimation methodologies are validated experimentally by comparing the results with the ground truth.
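A minimal sketch of sharpness-based depth estimation in this spirit: score a crop around the tracked object with a focus measure and read depth off a pre-calibrated sharpness-versus-depth curve. The calibration values below are illustrative, not from the paper.

```python
# Depth from defocus via a focus measure (variance of the Laplacian)
# and a hypothetical sharpness-vs-depth calibration curve.
import cv2
import numpy as np

def sharpness(gray):
    return cv2.Laplacian(gray, cv2.CV_64F).var()   # focus measure

# Hypothetical calibration: sharpness measured at known z offsets [um].
z_calib = np.array([-10.0, -5.0, 0.0, 5.0, 10.0])
s_calib = np.array([40.0, 90.0, 150.0, 95.0, 45.0])

def estimate_depth(gray):
    """Map sharpness to depth on the z >= 0 branch of the curve; the
    below/above-focus ambiguity must be resolved elsewhere, e.g. by a
    small test motion of the stage."""
    s = sharpness(gray)
    branch = slice(2, None)                  # z >= 0, sharpness decreasing
    return np.interp(s, s_calib[branch][::-1], z_calib[branch][::-1])
```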
Tools for fluid simulation control in computer graphics
Physics-based animation can generate dynamic systems of very complex and realistic behaviors.
Unfortunately, controlling them is a daunting task, and fluid simulation poses
particularly difficult problems for the control process. Although many methods
and tools have been developed to convincingly simulate and render fluids, too few methods
provide efficient and intuitive control over a simulation. Since control often comes with extra
computations on top of the simulation cost, art-directing a high-resolution simulation leads
to long iterations of the creative process. In order to shorten this process, editing could be
performed on a faster, low-resolution model. Therefore, we can consider that the process of
generating an art-directed fluid could be split into two stages: a control stage during which
an artist modifies the behavior of a low-resolution simulation, and an upresolution stage
during which a final high-resolution version of this simulation is driven. This thesis presents
two projects, each one improving on the state of the art related to each of these two stages.
First, we introduce a new particle-based liquid control system. Using this system, an
artist selects patches of precomputed liquid animations from a database, and places them in
a simulation to modify its behavior. At each simulation time step, our system uses these entities
to control the simulation in order to reproduce the artist’s vision. An intuitive graphical
user interface inspired by video editing tools has been developed, allowing a
non-technical user to easily edit a liquid animation. A sketch of such a
patch-driven control loop follows below.
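As a rough sketch of such a patch-driven control loop (our illustration, not the thesis implementation): each active patch pulls nearby simulation particles toward its own precomputed animated velocities with a radial falloff.

```python
# Toy patch-based liquid control: blend each patch's precomputed
# velocity field into the simulation with a radial weight.
import numpy as np

class Patch:
    def __init__(self, center, radius, velocity_fn):
        self.center = np.asarray(center)
        self.radius = radius
        self.velocity_fn = velocity_fn     # precomputed animation lookup

def apply_patches(positions, velocities, patches, t, strength=0.5):
    for patch in patches:
        d = np.linalg.norm(positions - patch.center, axis=1)
        w = np.clip(1.0 - d / patch.radius, 0.0, 1.0)[:, None]  # falloff
        target = patch.velocity_fn(positions, t)  # patch's animated field
        velocities += strength * w * (target - velocities)      # blend
    return velocities

# Example: a patch whose precomputed animation is a simple 2D swirl.
swirl = Patch(center=[0.5, 0.5], radius=0.2,
              velocity_fn=lambda p, t: np.stack(
                  [-(p[:, 1] - 0.5), p[:, 0] - 0.5], axis=1))
```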
Second, a tracking solution for smoke upresolution is described. We propose to add an
extra tracking step after the projection of a classical Eulerian smoke simulation. During
this step, we solve for a divergence-free velocity perturbation field resulting in a better
matching of the low-frequency density distribution between the low-resolution guide and the
high-resolution simulation. The resulting smoke animation faithfully reproduces the coarse
aspect of the low-resolution input, while being enhanced with simulated small-scale details.
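The core operation of that tracking step, projecting a velocity perturbation onto the divergence-free subspace, can be sketched with a standard pressure projection; the thesis additionally solves for the perturbation that best matches the guide densities, which this minimal NumPy version does not do.

```python
# Standard pressure projection on a 2D collocated grid (spacing 1):
# solve the Poisson equation for p with Jacobi iterations, then
# subtract grad p to make the field divergence-free.
import numpy as np

def project_div_free(u, v, iters=200):
    div = np.zeros_like(u)
    div[1:-1, 1:-1] = 0.5 * (u[1:-1, 2:] - u[1:-1, :-2]
                             + v[2:, 1:-1] - v[:-2, 1:-1])
    p = np.zeros_like(u)
    for _ in range(iters):                  # Jacobi iterations
        p[1:-1, 1:-1] = 0.25 * (p[1:-1, 2:] + p[1:-1, :-2]
                                + p[2:, 1:-1] + p[:-2, 1:-1]
                                - div[1:-1, 1:-1])
    u[1:-1, 1:-1] -= 0.5 * (p[1:-1, 2:] - p[1:-1, :-2])  # subtract grad p
    v[1:-1, 1:-1] -= 0.5 * (p[2:, 1:-1] - p[:-2, 1:-1])
    return u, v

u, v = np.random.randn(64, 64), np.random.randn(64, 64)
u, v = project_div_free(u, v)   # interior divergence is now near zero
```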
Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High Speed Scenarios
Event cameras are bio-inspired vision sensors that output pixel-level
brightness changes instead of standard intensity frames. These cameras do not
suffer from motion blur and have a very high dynamic range, which enables them
to provide reliable visual information during high speed motions or in scenes
characterized by high dynamic range. However, event cameras output little
information when motion is limited, for example when the camera is almost
still. Conversely, standard cameras provide instant and rich information
about the environment most of the time (in low-speed and good lighting
scenarios), but they fail severely in case of fast motions, or difficult
lighting such as high dynamic range or low light scenes. In this paper, we
present the first state estimation pipeline that leverages the complementary
advantages of these two sensors by fusing in a tightly-coupled manner events,
standard frames, and inertial measurements. We show on the publicly available
Event Camera Dataset that our hybrid pipeline leads to an accuracy improvement
of 130% over event-only pipelines, and 85% over standard-frames-only
visual-inertial systems, while still being computationally tractable.
Furthermore, we use our pipeline to demonstrate - to the best of our knowledge
- the first autonomous quadrotor flight using an event camera for state
estimation, unlocking flight scenarios that were not reachable with traditional
visual-inertial odometry, such as low-light environments and high-dynamic range
scenes.
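The paper's pipeline is a tightly-coupled keyframe-based optimization; purely as a toy illustration of why the two sensors are complementary, the loosely-coupled sketch below propagates state with the IMU and weights each camera's position fix by a scene-dependent confidence. All names and numbers are ours.

```python
# Toy loosely-coupled fusion: IMU propagation plus confidence-weighted
# corrections from the frame-based and event-based trackers. This is
# NOT the paper's tightly-coupled pipeline, only an intuition aid.
import numpy as np

def fuse_step(pos, vel, accel, dt, fixes):
    """fixes: list of (measured_pos, confidence in [0, 1]); the event
    camera's confidence rises with motion/HDR, the frame camera's falls."""
    vel = vel + accel * dt                 # IMU propagation
    pos = pos + vel * dt
    for z, conf in fixes:                  # per-sensor correction
        pos = pos + conf * (z - pos)
    return pos, vel

pos, vel = np.zeros(3), np.zeros(3)
imu_accel = np.array([0.0, 0.0, 0.1])
frame_fix = (np.array([0.010, 0.0, 0.000]), 0.2)  # blurred frame: low trust
event_fix = (np.array([0.012, 0.0, 0.001]), 0.8)  # events: high trust here
pos, vel = fuse_step(pos, vel, imu_accel, 0.01, [frame_fix, event_fix])
```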