
    HeadOn: Real-time Reenactment of Human Portrait Videos

    We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, facial expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. On top of the coarse geometric proxy, we propose a video-based rendering technique that composites the modified target portrait video via view- and pose-dependent texturing and creates photo-realistic imagery of the target actor under novel torso and head poses, facial expressions, and gaze directions. To this end, we propose robust tracking of the face and torso of the source actor. We extensively evaluate our approach and show that it enables significantly greater flexibility in creating realistic reenacted output videos. Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g; presented at Siggraph'18.
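
    The abstract describes view- and pose-dependent texturing only at a high level. Purely as an illustration of the general idea, here is a minimal sketch of view-dependent blending for a single surface point; the function name, the cosine-power weighting and the numbers are assumptions, not the authors' implementation.

```python
# Minimal sketch of view-dependent texture blending for one surface point.
# All names and the blending weight are illustrative assumptions.
import numpy as np

def blend_view_dependent(colors, capture_dirs, render_dir, sharpness=8.0):
    """Blend per-view texture samples for one surface point.

    colors       : (N, 3) RGB samples of the point seen from N captured views
    capture_dirs : (N, 3) unit viewing directions of those captured views
    render_dir   : (3,)   unit viewing direction of the novel (target) view
    sharpness    : exponent controlling how strongly nearby views dominate
    """
    # Angular agreement between the novel view and each captured view.
    cosines = np.clip(capture_dirs @ render_dir, 0.0, None)
    weights = cosines ** sharpness
    if weights.sum() == 0.0:            # point not seen from a compatible view
        return colors.mean(axis=0)      # fall back to an average color
    weights /= weights.sum()
    return weights @ colors             # weighted blend of the samples

# Example: three captured views, novel view close to the first one.
colors = np.array([[0.8, 0.2, 0.2], [0.6, 0.3, 0.3], [0.5, 0.5, 0.5]])
dirs = np.array([[0.0, 0.0, 1.0], [0.7, 0.0, 0.714], [1.0, 0.0, 0.0]])
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
print(blend_view_dependent(colors, dirs, np.array([0.1, 0.0, 0.995])))
```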

    Sparse Inertial Poser: Automatic 3D Human Pose Estimation from Sparse IMUs

    We address the problem of making human motion capture in the wild more practical by using a small set of inertial sensors attached to the body. Since the problem is heavily under-constrained, previous methods either use a large number of sensors, which is intrusive, or they require additional video input. We take a different approach and constrain the problem by (i) making use of a realistic statistical body model that includes anthropometric constraints and (ii) using a joint optimization framework to fit the model to orientation and acceleration measurements over multiple frames. The resulting tracker, Sparse Inertial Poser (SIP), enables 3D human pose estimation using only 6 sensors (attached to the wrists, lower legs, back and head) and works for arbitrary human motions. Experiments on the recently released TNT15 dataset show that, using the same number of sensors, SIP achieves higher accuracy than the dataset baseline without using any video data. We further demonstrate the effectiveness of SIP on newly recorded challenging motions in outdoor scenarios such as climbing or jumping over a wall. Comment: 12 pages; accepted at Eurographics 2017.
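
    To make the joint optimization idea more concrete, here is a toy sketch that fits pose parameters of a drastically simplified articulated model to orientation measurements over several frames, with a temporal smoothness term standing in for the statistical body prior. The 2-link planar model, the numbers and all names are assumptions for illustration, not the SMPL-based formulation used by SIP.

```python
# Toy sketch of multi-frame joint optimization against sparse orientation data.
import numpy as np
from scipy.optimize import least_squares

T = 5                                   # number of frames optimized jointly
meas = np.array([[0.2, 0.9],            # per-frame "IMU" orientations (radians)
                 [0.3, 1.0],
                 [0.4, 1.1],
                 [0.5, 1.2],
                 [0.6, 1.3]])

def residuals(x):
    theta = x.reshape(T, 2)             # joint angles: shoulder, elbow
    pred = np.stack([theta[:, 0],                       # upper-link orientation
                     theta[:, 0] + theta[:, 1]], axis=1)  # lower link = sum
    data_term = (pred - meas).ravel()
    smooth_term = 0.5 * np.diff(theta, axis=0).ravel()  # temporal prior
    return np.concatenate([data_term, smooth_term])

sol = least_squares(residuals, x0=np.zeros(2 * T))
print(sol.x.reshape(T, 2))              # recovered joint angles per frame
```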

    Advanced tracking and image registration techniques for intraoperative radiation therapy

    Intraoperative electron radiation therapy (IOERT) is a technique used to deliver radiation to the surgically opened tumor bed without irradiating healthy tissue. Treatment planning systems and mobile linear accelerators enable clinicians to optimize the procedure, minimize stress in the operating room (OR) and avoid transferring the patient to a dedicated radiation room. However, placement of the radiation collimator over the tumor bed requires a validation methodology to ensure correct delivery of the dose prescribed in the treatment planning system. In this dissertation, we address three well-known limitations of IOERT: applicator positioning over the tumor bed, docking of the mobile linear accelerator gantry with the applicator, and validation of the prescribed dose delivery. This thesis demonstrates that these limitations can be overcome by positioning the applicator appropriately with respect to the patient’s anatomy. The main objective of the study was to assess technological and procedural alternatives for improving IOERT performance and resolving the problems of uncertainty described above. Image-to-world registration, multicamera optical trackers, multimodal imaging techniques and mobile linear accelerator docking are addressed in the context of IOERT. IOERT is carried out by a multidisciplinary team in a highly complex environment that has special tracking needs owing to the characteristics of its working volume (i.e., large and prone to occlusions), in addition to the requirements of accuracy. The first part of this dissertation presents the validation of a commercial multicamera optical tracker in terms of accuracy, sensitivity to miscalibration, camera occlusions and detection of tools using a feasible surgical setup. It also proposes an automatic miscalibration detection protocol that satisfies the IOERT requirements of automaticity and speed. We show that the multicamera tracker is suitable for IOERT navigation and demonstrate the feasibility of the miscalibration detection protocol in clinical setups. Image-to-world registration is one of the main issues in image-guided applications where the field of interest and/or the number of possible anatomical localizations is large, as in IOERT. In the second part of this dissertation, a registration algorithm for image-guided surgery based on line-shaped fiducials (line-based registration) is proposed and validated. Line-based registration decreases acquisition time during surgery and achieves better registration accuracy than other published algorithms. In the third part of this dissertation, we integrate a commercial low-cost ultrasound transducer and a cone-beam CT C-arm with an optical tracker for image-guided interventions to enable surgical navigation, and we explore image-based registration techniques for both modalities. In the fourth part of the dissertation, a navigation system based on optical tracking for docking the mobile linear accelerator to the radiation applicator is assessed. This system improves safety and reduces procedure time. The system tracks the prescribed collimator location to compute the movements that the linear accelerator should perform to reach the docking position and warns the user about potentially unachievable arrangements before the actual procedure. A software application was implemented to use this system in the OR, where it was also evaluated to assess the improvement in docking speed.
Finally, in the last part of the dissertation, we present and assess the installation setup for a navigation system in a dedicated IOERT OR, determine the steps necessary for the IOERT process, identify workflow limitations and evaluate the feasibility of the integration of the system in a real OR. The navigation system safeguards the sterile conditions of the OR, clears the space available for surgeons and is suitable for any similar dedicated IOERT OR.
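
    As a concrete illustration of the image-to-world registration step discussed in this abstract, the following sketch performs generic point-based rigid registration (Kabsch/Horn) between fiducials measured in image space and the same fiducials tracked in the OR. It does not reproduce the proposed line-based registration (LBR) algorithm; the function name and the numbers are assumptions for illustration.

```python
# Generic point-based rigid registration (Kabsch/Horn) between paired fiducials.
import numpy as np

def rigid_register(image_pts, world_pts):
    """Find R, t such that world_pts ~ image_pts @ R.T + t (least squares)."""
    ci, cw = image_pts.mean(axis=0), world_pts.mean(axis=0)
    H = (image_pts - ci).T @ (world_pts - cw)        # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = cw - R @ ci
    return R, t

# Example: fiducials in image space and in tracker ("world") space.
img = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
true_R = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])   # 90 deg about z
wld = img @ true_R.T + np.array([10., 5., 2.])
R, t = rigid_register(img, wld)
print(np.allclose(R, true_R), np.round(t, 3))        # True [10. 5. 2.]
```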

    Autonomous model building using vision and manipulation

    It is often the case that robotic systems require models in order to successfully control themselves and to interact with the world. Models take many forms and include kinematic models to plan motions, dynamics models to understand the interaction of forces, and models of 3D geometry to check for collisions, to name but a few. Traditionally, models are provided to the robotic system by the designers that build the system. However, for long-term autonomy it becomes important for the robot to be able to build and maintain models of itself and of objects it might encounter. In this thesis, the argument for enabling robotic systems to autonomously build models is advanced and explored. The main contribution of this research is to show how a layered approach can be taken to building models. Thus a robot, starting with a limited amount of information, can autonomously build a number of models, including a kinematic model that describes the robot’s body and allows it to plan and perform future movements. Key to the incremental, autonomous approach is the use of exploratory actions: actions that the robot can perform in order to gain more information, either about itself or about an object with which it is interacting. A method is then presented whereby a robot, after being powered on, can home its joints using just vision; traditional methods such as absolute encoders or limit switches are not required. The ability to interact with objects in order to extract information is one of the main advantages that a robotic system has over a purely passive system when attempting to learn about or build models of objects. In light of this, the next contribution of this research is to look beyond the robot’s body and to present methods with which a robot can autonomously build models of objects in the world around it. The first class of objects examined is flat-pack cardboard boxes, a class of articulated objects with a number of interesting properties. It is shown how exploratory actions can be used to build a model of a flat-pack cardboard box and to locate any hinges the box may have. Specifically, it is shown how, when interacting with an object, a robot can combine haptic feedback from force sensors with visual feedback from a camera to obtain more information from an object than would be possible using a single sensor modality. The final contribution of this research is to present a series of exploratory actions for a robotic text-reading system that allow text to be found and read from an object. The text-reading system highlights how models of objects can take many forms, from a representation of their physical extents to the text that is written on them.
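
    The vision-only joint homing mentioned above can be illustrated with a small simulation: drive a joint in one direction and declare the home position once visually observed motion stops at the mechanical limit. The simulated joint and "camera" below stand in for real hardware and are assumptions for illustration, not the method implemented in the thesis.

```python
# Toy simulation of homing a joint using only visual motion feedback.
import numpy as np

class SimJoint:
    """Joint with an unknown starting angle and a hard limit at angle 0."""
    def __init__(self, start_angle):
        self.angle = start_angle
    def step(self, delta):
        self.angle = max(0.0, self.angle + delta)     # the limit stops the motion
    def observed_marker(self):
        # A camera would report the image position of a marker on the link;
        # here we return a 2D point that depends on the true joint angle.
        return np.array([np.cos(self.angle), np.sin(self.angle)])

def home_joint(joint, step=-0.05, motion_eps=1e-4):
    prev = joint.observed_marker()
    moves = 0
    while True:
        joint.step(step)
        cur = joint.observed_marker()
        if np.linalg.norm(cur - prev) < motion_eps:   # no visible motion: at the limit
            return moves                              # home found after `moves` steps
        prev, moves = cur, moves + 1

j = SimJoint(start_angle=1.3)                         # unknown to the controller
print("reached home after", home_joint(j), "steps")   # 26 steps of 0.05 rad
```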

    Incremental low rank noise reduction for robust infrared tracking of body temperature during medical imaging

    Thermal imagery for monitoring of body temperature provides a powerful tool to decrease health risks (e.g., burning) for patients during medical imaging (e.g., magnetic resonance imaging). The presented approach describes an experiment that simulates radiology conditions with infrared imaging, along with an automatic thermal monitoring/tracking system. The thermal tracking system uses incremental low-rank noise reduction based on incremental singular value decomposition (SVD) and applies color-based clustering to initialize the region of interest (ROI) boundary. A particle filter then tracks the ROI(s) through the entire thermal stream (video sequence). The thermal database contains 15 subjects in two positions (sitting and lying) in front of a thermal camera. This dataset was created to verify the robustness of our method with respect to motion artifacts and in the presence of additive noise (2–20% salt-and-pepper noise). The proposed approach was tested on the infrared images in the dataset and was able to successfully measure and track the ROI continuously (100% detection and tracking of participants' temperature), and it provided considerable robustness against noise (accuracy unchanged even at 20% additive noise), which shows promising performance.
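
    A minimal sketch of the low-rank noise-reduction idea follows: vectorize each thermal frame as a column, keep the leading singular components, and reconstruct. The paper uses an incremental SVD update; this batch version, the rank choice and the synthetic data are simplifying assumptions for illustration only.

```python
# Batch low-rank denoising of a stack of (synthetic) thermal frames.
import numpy as np

def low_rank_denoise(frames, rank=3):
    """frames: (T, H, W) stack of thermal images -> denoised stack."""
    T, H, W = frames.shape
    M = frames.reshape(T, H * W).T                  # columns are vectorized frames
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    M_lr = (U[:, :rank] * s[:rank]) @ Vt[:rank]     # rank-r approximation
    return M_lr.T.reshape(T, H, W)

# Synthetic test: a smooth temperature gradient plus salt-and-pepper noise.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(30.0, 37.0, 64), (64, 1))[None].repeat(20, axis=0)
noisy = clean.copy()
mask = rng.random(noisy.shape) < 0.05
noisy[mask] = rng.choice([20.0, 45.0], size=mask.sum())
den = low_rank_denoise(noisy, rank=1)
print(np.abs(den - clean).mean(), "vs", np.abs(noisy - clean).mean())
```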

    Occlusion-Aware Multi-View Reconstruction of Articulated Objects for Manipulation

    The goal of this research is to develop algorithms that use multiple views to automatically recover complete 3D models of articulated objects in unstructured environments, thereby enabling a robotic system to facilitate further manipulation of those objects. First, an algorithm called Procrustes-Lo-RANSAC (PLR) is presented. Structure-from-motion techniques are used to capture 3D point cloud models of an articulated object in two different configurations. Procrustes analysis, combined with a locally optimized RANSAC sampling strategy, facilitates a straightforward geometric approach to recovering the joint axes, as well as classifying them automatically as either revolute or prismatic. The algorithm does not require prior knowledge of the object, nor does it make any assumptions about the planarity of the object or scene. Second, with such a resulting articulated model, a robotic system is then able to manipulate the object along its joint axes at a specified grasp point in order to exercise its degrees of freedom, or to move its end effector to a particular position even if that point is not visible in the current view. This is one of the main advantages of the occlusion-aware approach: because the models capture all sides of the object, the robot has knowledge of parts of the object that are not visible in the current view. Experiments with a PUMA 500 robotic arm demonstrate the effectiveness of the approach on a variety of real-world objects containing both revolute and prismatic joints. Third, we improve the proposed approach by using an RGB-D sensor (Microsoft Kinect) that yields a depth value for each pixel directly, rather than requiring correspondences to establish depth. The KinectFusion algorithm is applied to produce a single high-quality, geometrically accurate 3D model, from which rigid links of the object are segmented and aligned, allowing the joint axes to be estimated using the geometric approach. The improved algorithm does not require artificial markers attached to objects, yields much denser 3D models and reduces the computation time.
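
    To make the geometric approach concrete, the sketch below recovers a revolute joint axis as the rotation axis of a link's rigid motion between the two configurations, assuming that rotation is already known. Estimating the motion via Procrustes analysis with locally optimized RANSAC, as PLR does, is not reproduced here; the 40-degree rotation is a toy value.

```python
# Extract the axis and angle of a link's rotation between two configurations.
import numpy as np

def rotation_axis_angle(R):
    """Axis (unit vector) and angle of a 3x3 rotation matrix."""
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    # The antisymmetric part encodes the axis (valid away from 0 and pi).
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return axis / (2.0 * np.sin(angle)), angle

# Toy link motion: rotation of 40 degrees about the z axis.
theta = np.deg2rad(40.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
axis, angle = rotation_axis_angle(R)
print(np.round(axis, 3), np.rad2deg(angle))   # [0. 0. 1.] 40.0 -> revolute axis
```

    A near-zero rotation angle combined with a non-zero translation between the two configurations would instead indicate a prismatic joint, with the translation direction as its axis.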