
    LiveCap: Real-time Human Performance Capture from Monocular Video

    We present the first real-time human performance capture approach that reconstructs dense, space-time-coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to the background-subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time-capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting per-frame non-linear optimization problems are solved with specially tailored data-parallel Gauss-Newton solvers. To achieve real-time performance of over 25 Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture, and it yields accuracy comparable to off-line performance capture techniques while being orders of magnitude faster.
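    As an illustration of the kind of per-frame least-squares solver the abstract refers to, the sketch below shows a generic damped Gauss-Newton loop in Python/NumPy. The residual and Jacobian callbacks are placeholders; the paper's actual solvers are data-parallel GPU implementations tailored to the pose-fitting and non-rigid-deformation energies.

    import numpy as np

    def gauss_newton(residual_fn, jacobian_fn, x0, iters=10, damping=1e-4):
        # Minimal damped Gauss-Newton loop for a least-squares energy
        # E(x) = ||r(x)||^2. Illustrative stand-in for the paper's
        # data-parallel GPU solvers; residual_fn/jacobian_fn are placeholders.
        x = x0.astype(float)
        for _ in range(iters):
            r = residual_fn(x)                      # stacked residual vector
            J = jacobian_fn(x)                      # Jacobian dr/dx
            H = J.T @ J + damping * np.eye(x.size)  # damped normal equations
            g = J.T @ r
            x -= np.linalg.solve(H, g)              # Gauss-Newton step
        return x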

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for registering multi-modal patient-specific data, both to enhance the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and to provide intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.

    Towards retrieving force feedback in robotic-assisted surgery: a supervised neuro-recurrent-vision approach

    Robotic-assisted minimally invasive surgeries have gained a lot of popularity over conventional procedures as they offer many benefits to both surgeons and patients. Nonetheless, they still suffer from some limitations that affect their outcome. One of them is the lack of force feedback, which restricts the surgeon's sense of touch and might reduce precision during a procedure. To overcome this limitation, we propose a novel force estimation approach that combines a vision-based solution with supervised learning to estimate the applied force and provide the surgeon with a suitable representation of it. The proposed solution starts with extracting the geometry of motion of the heart's surface by minimizing an energy functional to recover its 3D deformable structure. A deep network, based on an LSTM-RNN architecture, is then used to learn the relationship between the extracted visual-geometric information and the applied force, and to find an accurate mapping between the two. Our proposed force estimation solution avoids the drawbacks usually associated with force sensing devices, such as biocompatibility and integration issues. We evaluate our approach on phantom and realistic tissues, on which we report an average root-mean-square error of 0.02 N.
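    As a hedged sketch of the sequence-to-force mapping described above, an LSTM regressor in Python/PyTorch might look like the following. The layer sizes, feature dimension, and two-layer depth are assumptions for illustration, not the architecture reported in the paper.

    import torch
    import torch.nn as nn

    class ForceRegressor(nn.Module):
        # Maps a sequence of visual-geometric feature vectors to a force value.
        # All dimensions below are illustrative assumptions.
        def __init__(self, feat_dim=64, hidden_dim=128, force_dim=1):
            super().__init__()
            self.lstm = nn.LSTM(feat_dim, hidden_dim, num_layers=2, batch_first=True)
            self.head = nn.Linear(hidden_dim, force_dim)

        def forward(self, x):                 # x: (batch, time, feat_dim)
            out, _ = self.lstm(x)             # out: (batch, time, hidden_dim)
            return self.head(out[:, -1])      # force estimate at the last step

    # toy usage: 8 sequences, 30 time steps, 64-D features -> (8, 1) forces
    model = ForceRegressor()
    forces = model(torch.randn(8, 30, 64))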

    Evaluation of trackers for Pan-Tilt-Zoom Scenarios

    Tracking with a Pan-Tilt-Zoom (PTZ) camera has been a research topic in computer vision for many years. Compared to tracking with a still camera, the images captured with a PTZ camera are highly dynamic in nature because the camera can perform large motions, resulting in quickly changing capture conditions. Furthermore, tracking with a PTZ camera involves camera control to position the camera on the target. For successful tracking and camera control, the tracker must be fast enough, or it must accurately predict the next position of the target. Therefore, standard benchmarks do not allow a proper assessment of a tracker's quality for the PTZ scenario. In this work, we use a virtual PTZ framework to evaluate different tracking algorithms and compare their performance. We also extend the framework to add target position prediction for the next frame, accounting for camera motion and processing delays. By doing this, we can assess whether prediction can make long-term tracking more robust, as it may help slower algorithms keep the target in the camera's field of view. Results confirm that both speed and robustness are required for tracking under the PTZ scenario.
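    A minimal example of the kind of target prediction mentioned above, assuming a constant-velocity motion model, a known processing delay, and an externally supplied estimate of the image shift caused by the commanded camera motion. These are simplifying assumptions for illustration, not the exact formulation evaluated in the paper.

    import numpy as np

    def predict_target(prev_pos, curr_pos, dt, delay, cam_shift=(0.0, 0.0)):
        # Constant-velocity prediction of the target's image position after a
        # known processing/actuation delay, compensated by the expected image
        # shift induced by the commanded pan-tilt-zoom motion (cam_shift).
        prev_pos = np.asarray(prev_pos, dtype=float)
        curr_pos = np.asarray(curr_pos, dtype=float)
        velocity = (curr_pos - prev_pos) / dt          # pixels per second
        return curr_pos + velocity * delay - np.asarray(cam_shift, dtype=float)

    # e.g. target moved from (100, 80) to (110, 82) in 40 ms, with 60 ms delay
    print(predict_target((100, 80), (110, 82), dt=0.04, delay=0.06))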

    Depth estimation of optically transparent laser-driven microrobots

    Six degree-of-freedom (DoF) pose feedback is essential for the development of closed-loop control techniques for microrobotics. This paper presents two methods for depth estimation of transparent microrobots inside an Optical Tweezers (OT) setup, using image sharpness measurements and model-based tracking. The x-y position and the 3D orientation of the object are estimated using online model-based template matching. The proposed depth estimation methodologies are validated experimentally by comparing the results with the ground truth.
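    One common way to turn an image sharpness measurement into a depth estimate is to look it up against a pre-recorded sharpness-versus-depth calibration curve. The rough sketch below uses that approach with a variance-of-Laplacian focus measure; both choices are assumptions for illustration and not necessarily the measures used in the paper.

    import cv2
    import numpy as np

    def sharpness(gray):
        # Variance-of-Laplacian focus measure; larger means sharper.
        return cv2.Laplacian(gray.astype(np.float64), cv2.CV_64F).var()

    def estimate_depth(gray_crop, calib_sharpness, calib_depths):
        # Return the calibration depth whose pre-recorded sharpness value is
        # closest to the sharpness of the current crop of the microrobot.
        s = sharpness(gray_crop)
        idx = int(np.argmin(np.abs(np.asarray(calib_sharpness) - s)))
        return calib_depths[idx]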

    Tools for fluid simulation control in computer graphics

    Physics-based animation can generate dynamic systems with very complex and realistic behaviors. Unfortunately, controlling them is a daunting task, and fluid simulation poses particularly difficult problems for the control process. Although many methods and tools have been developed to convincingly simulate and render fluids, too few provide efficient and intuitive control over a simulation. Since control often adds computation on top of the simulation cost, art-directing a high-resolution simulation leads to long iterations of the creative process. To shorten this process, editing can be performed on a faster, low-resolution model. The process of generating an art-directed fluid can therefore be split into two stages: a control stage, during which an artist modifies the behavior of a low-resolution simulation, and an upresolution stage, during which a final high-resolution version of this simulation is generated. This thesis presents two projects, each improving on the state of the art related to one of these two stages.
    First, we introduce a new particle-based liquid control system. Using this system, an artist selects patches of precomputed liquid animations from a database and places them in a simulation to modify its behavior. At each simulation time step, our system uses these entities to control the simulation and reproduce the artist's vision. An intuitive graphical user interface, inspired by video editing tools, has been developed, allowing a non-technical user to easily edit a liquid animation. Second, a tracking solution for smoke upresolution is described. We propose to add an extra tracking step after the projection step of a classical Eulerian smoke simulation. During this step, we solve for a divergence-free velocity perturbation field, resulting in a better match of the low-frequency density distribution between the low-resolution guide and the high-resolution simulation. The resulting smoke animation faithfully reproduces the coarse aspect of the low-resolution input while being enhanced with simulated small-scale details.
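    To illustrate what "solving for a divergence-free velocity perturbation" involves, the sketch below applies a spectral Helmholtz projection to a 2D periodic perturbation field. It is a compact stand-in for the grid-based pressure solve an Eulerian smoke solver would normally use, not the thesis's tracking step itself.

    import numpy as np

    def project_divergence_free(u, v):
        # Remove the divergent (longitudinal) part of a 2D periodic velocity
        # perturbation via a spectral Helmholtz projection.
        ny, nx = u.shape
        kx = np.fft.fftfreq(nx).reshape(1, nx)
        ky = np.fft.fftfreq(ny).reshape(ny, 1)
        u_h, v_h = np.fft.fft2(u), np.fft.fft2(v)
        k2 = kx**2 + ky**2
        k2[0, 0] = 1.0                        # avoid division by zero at k = 0
        div_h = (kx * u_h + ky * v_h) / k2    # longitudinal component
        u_h -= kx * div_h
        v_h -= ky * div_h
        return np.fft.ifft2(u_h).real, np.fft.ifft2(v_h).real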

    Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High Speed Scenarios

    Event cameras are bio-inspired vision sensors that output pixel-level brightness changes instead of standard intensity frames. These cameras do not suffer from motion blur and have a very high dynamic range, which enables them to provide reliable visual information during high-speed motions or in scenes characterized by high dynamic range. However, event cameras output little information when the amount of motion is limited, such as when there is almost no motion. Conversely, standard cameras provide instant and rich information about the environment most of the time (in low-speed and good-lighting scenarios), but they fail severely in the case of fast motions or difficult lighting such as high-dynamic-range or low-light scenes. In this paper, we present the first state estimation pipeline that leverages the complementary advantages of these two sensors by fusing events, standard frames, and inertial measurements in a tightly coupled manner. We show on the publicly available Event Camera Dataset that our hybrid pipeline leads to an accuracy improvement of 130% over event-only pipelines and 85% over standard-frames-only visual-inertial systems, while still being computationally tractable. Furthermore, we use our pipeline to demonstrate - to the best of our knowledge - the first autonomous quadrotor flight using an event camera for state estimation, unlocking flight scenarios that were not reachable with traditional visual-inertial odometry, such as low-light environments and high-dynamic-range scenes.
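    As one small ingredient that hybrid event/frame pipelines typically rely on, events within a short time window can be aggregated into an image-like "event frame" on which conventional feature detectors and trackers can operate. The sketch below is a simplified illustration; the windowing, polarity handling, and any motion compensation in the actual pipeline may differ.

    import numpy as np

    def accumulate_event_frame(events, height, width, t_start, t_end):
        # Sum polarity-signed events with timestamps in [t_start, t_end) into
        # a 2D image so that standard corner detection / tracking can be used.
        # `events` is an (N, 4) array of rows (t, x, y, polarity in {-1, +1}).
        frame = np.zeros((height, width), dtype=np.float32)
        sel = events[(events[:, 0] >= t_start) & (events[:, 0] < t_end)]
        np.add.at(frame, (sel[:, 2].astype(int), sel[:, 1].astype(int)), sel[:, 3])
        return frame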