3,162 research outputs found

    Image enhancement from a stabilised video sequence

    The aim of video stabilisation is to create a new video sequence where the motions (i.e. rotations, translations) and scale differences between frames (or parts of a frame) have effectively been removed. These stabilisation effects can be obtained via digital video processing techniques which use the information extracted from the video sequence itself, with no need for additional hardware or knowledge about the camera's physical motion. A video sequence usually contains a large overlap between successive frames, and regions of the same scene are sampled at different positions. In this paper, these multiple samples are combined to achieve images with a higher spatial resolution. Higher resolution imagery plays an important role in assisting in the identification of people, vehicles, structures or objects of interest captured by surveillance cameras or by video cameras used in face recognition, traffic monitoring, traffic law enforcement, driver assistance and automatic vehicle guidance systems.
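
    The idea of combining multiple samples of the same scene can be illustrated with a shift-and-add scheme. The sketch below is not the paper's method; it is a minimal illustration that assumes grayscale frames, purely translational motion, and sub-pixel shifts that are already known from stabilisation. The function name `shift_and_add` and its parameters are hypothetical.

```python
import numpy as np


def shift_and_add(frames, shifts, scale):
    """Fuse registered low-resolution frames onto a grid `scale` times finer.

    frames: list of 2D arrays with identical shape (assumed grayscale).
    shifts: list of (dy, dx) offsets of each frame, in low-resolution pixels
            (assumed known; a real system would estimate them).
    scale:  integer upsampling factor of the output grid.
    Cells that receive no sample are returned as NaN.
    """
    h, w = frames[0].shape
    acc = np.zeros((h * scale, w * scale))  # sum of samples per fine cell
    cnt = np.zeros_like(acc)                # number of samples per fine cell
    ys, xs = np.mgrid[0:h, 0:w]
    for frame, (dy, dx) in zip(frames, shifts):
        # Position of every low-res sample on the fine grid.
        hy = np.rint((ys + dy) * scale).astype(int)
        hx = np.rint((xs + dx) * scale).astype(int)
        ok = (hy >= 0) & (hy < h * scale) & (hx >= 0) & (hx < w * scale)
        np.add.at(acc, (hy[ok], hx[ok]), frame[ok])
        np.add.at(cnt, (hy[ok], hx[ok]), 1)
    out = np.full_like(acc, np.nan)
    filled = cnt > 0
    out[filled] = acc[filled] / cnt[filled]  # average overlapping samples
    return out
```

    With four frames offset by half a pixel in each direction and `scale=2`, every cell of the fine grid receives exactly one sample. Real sequences need interpolation for empty cells and deblurring, which this sketch leaves out.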

    07171 Abstracts Collection -- Visual Computing -- Convergence of Computer Graphics and Computer Vision

    From 22.04. to 27.04.2007, the Dagstuhl Seminar 07171 "Visual Computing - Convergence of Computer Graphics and Computer Vision" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Dense Vision in Image-guided Surgery

    Image-guided surgery needs an efficient and effective camera tracking system in order to perform augmented reality, overlaying preoperative models or labelling cancerous tissues on the 2D video images of the surgical scene. Tracking in endoscopic/laparoscopic scenes, however, is an extremely difficult task, primarily due to tissue deformation, instrument invasion into the surgical scene and the presence of specular highlights. State-of-the-art feature-based SLAM systems such as PTAM fail in tracking such scenes since the number of good features to track is very limited. Smoke in the scene and instrument motion cause feature-based tracking to fail immediately. The work of this thesis provides a systematic approach to this problem using dense vision. We initially attempted to register a 3D preoperative model with multiple 2D endoscopic/laparoscopic images using a dense method, but this approach did not perform well. We subsequently proposed stereo reconstruction to directly obtain the 3D structure of the scene. By using the dense reconstructed model together with robust estimation, we demonstrate that dense stereo tracking can be highly robust even within extremely challenging endoscopic/laparoscopic scenes. Several validation experiments have been conducted in this thesis. The proposed stereo reconstruction algorithm has turned out to be the state-of-the-art method for several publicly available ground truth datasets. Furthermore, the proposed robust dense stereo tracking algorithm has proved highly accurate in a synthetic environment (< 0.1 mm RMSE) and qualitatively extremely robust when applied to real scenes in RALP prostatectomy surgery. This is an important step toward achieving accurate image-guided laparoscopic surgery.

    Minimizing the Multi-view Stereo Reprojection Error for Triangular Surface Meshes

    This article proposes a variational multi-view stereo vision method based on meshes for recovering 3D scenes (shape and radiance) from images. Our method is based on generative models and minimizes the reprojection error (difference between the observed images and the images synthesized from the reconstruction). Our contributions are twofold. 1) For the first time, we rigorously compute the gradient of the reprojection error for non-smooth surfaces defined by discrete triangular meshes. The gradient correctly takes into account the visibility changes that occur when a surface moves; this forces the contours generated by the reconstructed surface to match the apparent contours in the input images. 2) We propose an original modification of the Lambertian model to take into account deviations from the constant brightness assumption without explicitly modelling the reflectance properties of the scene or other photometric phenomena introduced by the camera model. Our method is thus able to recover the shape and the diffuse radiance of non-Lambertian scenes.
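
    For orientation, a reprojection error of the kind described here is commonly written as a sum of squared differences over all views. The form below is a generic sketch, not the paper's exact functional; the symbols $S$ (the surface mesh), $I_i$ (the observed image of view $i$), $\tilde{I}_i$ (the image synthesized from the reconstruction) and $\Omega_i$ (the image domain of view $i$) are notation assumed here for illustration.

```latex
E(S) \;=\; \sum_{i=1}^{N} \int_{\Omega_i}
      \bigl( I_i(\mathbf{u}) - \tilde{I}_i(\mathbf{u};\, S) \bigr)^2 \, d\mathbf{u}
```

    Because $\tilde{I}_i$ depends on which parts of $S$ are visible at each pixel $\mathbf{u}$, moving the surface changes the integrand at occlusion boundaries, which is why the gradient has to account for visibility changes.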

    Modelling 3D humans : pose, shape, clothing and interactions

    Digital humans are increasingly becoming a part of our lives with applications like animation, gaming, virtual try-on, Metaverse and much more. In recent years there has been a great push to make our models of digital humans as real as possible. In this thesis we present methodologies to model two key characteristics of real humans, their appearance and actions. This thesis covers four innovations: (i) MGN, the first approach to reconstruct 3D garments and body shape underneath, as separate meshes, from a few RGB images of a person. This allows, for the first time, real world applications like texture transfer, garment transfer and virtual try-on in 3D, using just images. (ii) IPNet, a neural network that leverages implicit functions for detailed reconstruction and registers the reconstructed mesh with the parametric SMPL model to make it controllable for real world tasks like animation and editing. (iii) LoopReg, a novel formulation that makes the 3D registration task end-to-end differentiable for the first time. Semi-supervised LoopReg outperforms contemporary supervised methods using ~100x less supervised data. (iv) BEHAVE, the first dataset and method to track full body real interactions between humans and movable objects. All our code, MGN digital wardrobe and BEHAVE dataset are publicly available for further research.

    Digital humans are increasingly becoming a part of our lives, with applications such as animation, gaming, virtual try-on, the Metaverse and much more. In recent years, great efforts have been made to make our models of digital humans as real as possible. In this thesis we present methods for modelling two key characteristics of real humans: their appearance and their actions. We propose MGN, the first approach to reconstruct 3D garments and the underlying body shape as separate meshes from a few RGB images of a person. We extend the widely used SMPL body model, which represents only unclothed shapes, to also capture garments (SMPL+G). SMPL+G can be dressed with garments that can be posed and shaped according to the SMPL model. This enables, for the first time, real-world applications such as texture transfer, garment transfer and virtual try-on in 3D, using only images. We also highlight the key limitation of the mesh-based representation for digital humans, namely its ability to represent high-frequency detail. We therefore investigate the new implicit-function-based representation as an alternative to the mesh-based representation (including parametric models such as SMPL) for digital humans. Typically, methods based on the latter lack detail, while the former lack control. We propose IPNet, a neural network that uses implicit functions for detailed reconstruction and registers the reconstructed mesh with the parametric SMPL model to make it controllable. This combines the best of both worlds. We study the process of registering a parametric model, such as SMPL, to a 3D mesh. This decades-old problem in computer vision and graphics usually requires a two-step process: i) establishing correspondences between the model and the mesh, and ii) optimising the model to minimise the distance between corresponding points. This two-step process is not end-to-end differentiable. We propose LoopReg, which uses a new implicit-function-based representation of the model and makes the registration differentiable. Semi-supervised LoopReg outperforms current supervised methods with ~100x less supervised data. Modelling human appearance is necessary but not sufficient to create realistic digital humans. We must model not only how humans look but also how they interact with the objects around them. To this end we present BEHAVE, the first dataset of real full-body interactions between humans and movable objects. We provide segmented multi-view RGBD frames together with registered SMPL and object fits, as well as contact annotations in 3D. The BEHAVE dataset contains ~15k frames and its extension contains ~400k frames with pseudo-ground-truth annotations. Our BEHAVE method uses this dataset to train a neural network that jointly tracks the person, the object and the contacts between them. In this thesis we explore the above ideas and provide an in-depth analysis of our key ideas and design decisions. We also discuss the limitations of our ideas and propose future work, not only to address these limitations but also to extend the research further. All our code, the digital wardrobe and the dataset are publicly available for further research.

    Joint 3D estimation of vehicles and scene flow

    driving. While much progress has been made in recent years, imaging conditions in natural outdoor environments are still very challenging for current reconstruction and recognition methods. In this paper, we propose a novel unified approach which reasons jointly about 3D scene flow as well as the pose, shape and motion of vehicles in the scene. Towards this goal, we incorporate a deformable CAD model into a slanted-plane conditional random field for scene flow estimation and enforce shape consistency between the rendered 3D models and the parameters of all superpixels in the image. The association of superpixels to objects is established by an index variable which implicitly enables model selection. We evaluate our approach on the challenging KITTI scene flow dataset in terms of object and scene flow estimation. Our results provide a proof of concept and demonstrate the usefulness of our method. © 2015 Copernicus GmbH. All Rights Reserved.

    Background Subtraction in Video Surveillance

    The aim of this thesis is the real-time detection of moving and new static objects in unconstrained surveillance environments monitored with static cameras. This is achieved based on the results of background subtraction, for which Gaussian Mixture Models (GMMs) and kernel density estimation (KDE) are used. A thorough review of state-of-the-art formulations of GMMs and KDE for background subtraction reveals further development opportunities, which are tackled in a novel GMM-based approach incorporating a variance-controlling scheme. The proposed approach covers both parametric and non-parametric models and yields a better background subtraction method, with higher accuracy and easier parametrization of the models across different environments. It also converges to more accurate models of the scenes. Moving objects are detected from the results of background subtraction. For the detection of new static objects, two background models, learning at different rates, are used. This allows for a multi-class pixel classification, which follows the temporality of the changes detected by means of background subtraction. In the first approach, background subtraction is performed with a parametric model and the results are shown. The second approach performs background subtraction with a non-parametric KDE model. Furthermore, we have done some video engineering, in which the background subtraction algorithm is used to merge the background from one video with the foreground from another to form a new video. In this way, more complex video engineering with multiple videos is also possible. Finally, the results provided by region analysis can be used to improve the quality of the background models, thereby considerably improving the detection results.
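
    The use of two background models learning at different rates can be sketched as follows. This is a simplified illustration, not the thesis's GMM-based method: plain running averages are swapped in for the mixture models, frames are assumed to be grayscale, and the class name, learning rates and threshold are hypothetical values chosen for the example.

```python
import numpy as np

BACKGROUND, MOVING, NEW_STATIC = 0, 1, 2


class DualRateBackground:
    """Two running-average background models with different learning rates.

    Simplifying assumption: each model is one running mean per pixel,
    standing in for the per-pixel GMMs or KDE models named in the abstract.
    """

    def __init__(self, first_frame, fast=0.5, slow=0.01, thresh=25.0):
        self.fast_bg = first_frame.astype(float)  # adapts within a few frames
        self.slow_bg = first_frame.astype(float)  # adapts over many frames
        self.fast, self.slow, self.thresh = fast, slow, thresh

    def step(self, frame):
        """Classify each pixel of `frame`, then update both models."""
        frame = frame.astype(float)
        fg_fast = np.abs(frame - self.fast_bg) > self.thresh
        fg_slow = np.abs(frame - self.slow_bg) > self.thresh
        labels = np.full(frame.shape, BACKGROUND, dtype=np.uint8)
        # Differs from both models: a change that is still in progress.
        labels[fg_fast & fg_slow] = MOVING
        # Absorbed by the fast model but not yet by the slow one:
        # something that arrived and then stayed put.
        labels[~fg_fast & fg_slow] = NEW_STATIC
        self.fast_bg += self.fast * (frame - self.fast_bg)
        self.slow_bg += self.slow * (frame - self.slow_bg)
        return labels
```

    A pixel that changes and then stays constant is first labelled `MOVING` and, once the fast model has absorbed it, `NEW_STATIC`. It returns to `BACKGROUND` only after the slow model has caught up, which is the temporal behaviour the multi-class pixel classification relies on.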

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft-tissues. This information is prerequisite to the registration of multi-modal patient-specific data for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.