
    Flight Dynamics-based Recovery of a UAV Trajectory using Ground Cameras

    We propose a new method to estimate the 6-DoF trajectory of a flying object, such as a quadrotor UAV, within a 3D airspace monitored by multiple fixed ground cameras. It is based on a new structure-from-motion formulation for the 3D reconstruction of a single moving point with known motion dynamics. Our main contribution is a new bundle adjustment procedure which, in addition to optimizing the camera poses, regularizes the point trajectory using a prior based on motion dynamics (specifically, flight dynamics). Furthermore, we can infer the underlying control input sent to the UAV's autopilot that determined its flight trajectory. Our method requires neither perfect single-view tracking nor appearance matching across views. For robustness, we allow the tracker to generate multiple detections per frame in each video. The true detections and the data association across videos are estimated using robust multi-view triangulation and subsequently refined during our bundle adjustment procedure. Quantitative evaluation on simulated data and experiments on real videos from indoor and outdoor scenes demonstrate the effectiveness of our method.
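
    A minimal sketch of this formulation, assuming calibrated cameras for simplicity: the trajectory is recovered by minimizing reprojection error jointly with a dynamics regularizer. The jerk penalty below is a generic stand-in for the paper's flight-dynamics prior, and all names (project, residuals, recover_trajectory) are illustrative.

```python
# Hedged sketch: trajectory recovery with a motion-dynamics prior.
# Assumes known 3x4 camera matrices Ps[c] and per-frame 2D detections
# obs[c][t] (None where the tracker missed). The jerk penalty is a
# generic stand-in for the paper's flight-dynamics prior.
import numpy as np
from scipy.optimize import least_squares

def project(P, X):
    """Project a 3D point X with a 3x4 camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def residuals(flat_traj, Ps, obs, lam=10.0):
    T = len(obs[0])
    X = flat_traj.reshape(T, 3)            # one 3D point per frame
    res = []
    for P, cam_obs in zip(Ps, obs):        # reprojection terms per camera
        for t, uv in enumerate(cam_obs):
            if uv is not None:             # tolerate missing detections
                res.extend(project(P, X[t]) - uv)
    # dynamics prior: penalize the third finite difference (jerk) of the path
    jerk = X[3:] - 3 * X[2:-1] + 3 * X[1:-2] - X[:-3]
    res.extend((lam * jerk).ravel())
    return np.asarray(res)

def recover_trajectory(Ps, obs, X_init):
    """X_init: (T, 3) initial path, e.g. from per-frame triangulation."""
    sol = least_squares(residuals, X_init.ravel(), args=(Ps, obs))
    return sol.x.reshape(-1, 3)
```

    Note that the paper also optimizes the camera poses and infers control inputs; this sketch fixes the cameras and recovers only the point trajectory.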

    Reconstruction of the pose of uncalibrated cameras via user-generated videos

    Extraction of 3D geometry from hand-held, unsteady, uncalibrated cameras faces multiple difficulties: finding usable frames, feature matching, and unknown variable focal length, to name three. We have built a prototype system that allows a user to spatially navigate playback viewpoints of an event of interest, using geometry automatically recovered from casually captured videos. The system, whose workings we present in this paper, estimates not only scene geometry but also relative viewpoint position, overcoming the mentioned difficulties in the process. The only inputs required are video sequences of a common scene from various viewpoints, as are readily available online from sporting and music events. Our methods make no assumption about the synchronization of the input and do not require file metadata, instead exploiting the video itself to self-calibrate. The footage need only contain some camera rotation with little translation, a likely occurrence in hand-held event footage. This is the author accepted manuscript. The final version is available from IEEE via http://dx.doi.org/10.1145/2659021.265902
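
    The self-calibration step can exploit a standard fact: frames from a (nearly) purely rotating camera are related by the homography H = K R K^{-1}. The sketch below searches for the focal length that makes the conjugated homography closest to a rotation; it is a generic illustration of this principle, not the paper's exact pipeline, and the function names and search range are assumptions.

```python
# Hedged sketch: focal-length self-calibration from a rotation-only
# homography H (e.g. from matched features between two frames).
# (cx, cy) is the assumed principal point, typically the image center.
import numpy as np

def rotation_error(H, f, cx, cy):
    """How far K^-1 H K is from a rotation, for focal-length guess f."""
    K = np.array([[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]])
    R = np.linalg.inv(K) @ H @ K
    R /= np.cbrt(np.linalg.det(R))              # remove the arbitrary scale of H
    return np.linalg.norm(R @ R.T - np.eye(3))  # zero for a true rotation

def estimate_focal(H, cx, cy, candidates=np.linspace(300, 3000, 200)):
    """Grid-search the focal length that best explains H as K R K^-1."""
    errors = [rotation_error(H, f, cx, cy) for f in candidates]
    return candidates[int(np.argmin(errors))]
```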

    Probabilistic Triangulation for Uncalibrated Multi-View 3D Human Pose Estimation

    3D human pose estimation has been a long-standing challenge in computer vision and graphics, where multi-view methods have made significant progress but are limited by tedious calibration processes. Existing multi-view methods are restricted to fixed camera poses and therefore lack the ability to generalize. This paper presents a novel Probabilistic Triangulation module that can be embedded in a calibrated 3D human pose estimation method, generalizing it to uncalibrated scenes. The key idea is to model the camera pose with a probability distribution and to iteratively update that distribution from 2D features rather than relying on a known camera pose. Specifically, we maintain a camera pose distribution and iteratively update it by computing the posterior probability of the camera pose through Monte Carlo sampling. This way, gradients can be back-propagated directly from the 3D pose estimation to the 2D heatmaps, enabling end-to-end training. Extensive experiments on Human3.6M and CMU Panoptic demonstrate that our method outperforms other uncalibrated methods and achieves results comparable to state-of-the-art calibrated methods. Our method thus achieves a trade-off between estimation accuracy and generalizability. Our code is available at https://github.com/bymaths/probabilistic_triangulation
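
    A heavily simplified sketch of the Monte Carlo update described above: keep a set of camera-pose samples, weight each by how well it explains the observed 2D keypoints, and resample. The pose parametrization, the project callable, and sigma are placeholders; the differentiable end-to-end training is not reproduced here.

```python
# Hedged sketch of a Monte Carlo posterior update over camera poses.
# `project(pose, points_3d)` is an assumed callable returning (J, 2) pixels.
import numpy as np

def reprojection_loglik(pose, keypoints_2d, points_3d, project, sigma=5.0):
    """Gaussian log-likelihood of observed 2D keypoints under a pose sample."""
    pred = project(pose, points_3d)
    err = np.linalg.norm(pred - keypoints_2d, axis=-1)
    return -0.5 * np.sum((err / sigma) ** 2)

def update_pose_distribution(samples, keypoints_2d, points_3d, project,
                             rng=np.random.default_rng()):
    """One update step: importance-weight the samples, then resample."""
    logw = np.array([reprojection_loglik(s, keypoints_2d, points_3d, project)
                     for s in samples])
    w = np.exp(logw - logw.max())           # stabilized softmax weights
    w /= w.sum()
    idx = rng.choice(len(samples), size=len(samples), p=w)
    return [samples[i] for i in idx]        # approximate posterior samples
```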

    SmartMocap: Joint Estimation of Human and Camera Motion using Uncalibrated RGB Cameras

    Markerless human motion capture (mocap) from multiple RGB cameras is a widely studied problem. Existing methods either need calibrated cameras or calibrate them relative to a static camera, which acts as the reference frame for the mocap system. The calibration step has to be done a priori for every capture session, which is a tedious process, and re-calibration is required whenever cameras are intentionally or accidentally moved. In this paper, we propose a mocap method which uses multiple static and moving extrinsically uncalibrated RGB cameras. The key components of our method are as follows. First, since the cameras and the subject can move freely, we select the ground plane as a common reference to represent both the body and the camera motions, unlike existing methods which represent bodies in camera coordinates. Second, we learn a probability distribution of short human motion sequences (~1 sec) relative to the ground plane and leverage it to disambiguate between camera and human motion. Third, we use this distribution as a motion prior in a novel multi-stage optimization approach that fits the SMPL human body model and the camera poses to the human body keypoints in the images. Finally, we show that our method can work on a variety of datasets, ranging from aerial cameras to smartphones. It also gives more accurate results than the state of the art on the task of monocular human mocap with a static camera. Our code is available for research purposes at https://github.com/robot-perception-group/SmartMocap
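
    An illustrative cost function for the multi-stage optimization, under strong simplifications: SMPL fitting details and the staging are omitted, the learned motion prior is a placeholder callable, the window length assumes 30 fps, and all names are hypothetical.

```python
# Hedged sketch: joint body/camera objective in a shared ground-plane frame.
# `project(cam, body)` and `motion_prior(window)` are assumed callables;
# `motion_prior` returns the negative log-likelihood of a ~1 sec window.
import numpy as np

def mocap_objective(body_seq, cam_poses, keypoints_2d, project, motion_prior,
                    w_prior=1.0, fps=30):
    """body_seq: (T, D) body parameters per frame, in the ground-plane frame.
    cam_poses[c], keypoints_2d[c][t]: per-camera poses and (J, 2) keypoints."""
    cost = 0.0
    for cam, obs in zip(cam_poses, keypoints_2d):   # reprojection terms
        for t in range(len(body_seq)):
            pred = project(cam, body_seq[t])        # (J, 2) projected joints
            cost += np.sum((pred - obs[t]) ** 2)
    # motion prior over ~1 sec windows, disambiguating camera vs body motion
    for t0 in range(0, len(body_seq) - fps + 1, fps):
        cost += w_prior * motion_prior(body_seq[t0:t0 + fps])
    return cost
```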

    Camera Network Calibration and Synchronization from Silhouettes in Archived Video

    In this paper we present an automatic method for calibrating a network of cameras that works by analyzing only the motion of silhouettes in the multiple video streams. This is particularly useful for the automatic reconstruction of a dynamic event using a camera network when pre-calibration of the cameras is impractical or even impossible. The key contribution of this work is a RANSAC-based algorithm that simultaneously computes the epipolar geometry and synchronization of a pair of cameras from the motion of silhouettes in video alone. Our approach first computes the fundamental matrix and synchronization independently for multiple pairs of cameras in the network. In the next stage, the calibration and synchronization of the complete network are recovered from the pairwise information. Finally, a visual-hull algorithm is used to reconstruct the shape of the dynamic object from its silhouettes in video. For unsynchronized video streams with sub-frame temporal offsets, we interpolate silhouettes between successive frames to obtain more accurate visual hulls. We show the effectiveness of our method by remotely calibrating several different indoor camera networks from archived video streams.
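
    A simplified RANSAC-style search capturing the joint estimation of epipolar geometry and temporal offset: for each candidate integer frame offset, fit a fundamental matrix to time-aligned silhouette points and keep the offset with the most epipolar inliers. It assumes candidate point matches (e.g., silhouette frontier points) are already available; the paper's sampling of epipole hypotheses from silhouette tangents, and its sub-frame refinement, are more involved.

```python
# Hedged sketch: jointly search a temporal offset and a fundamental matrix.
# pts_seq1[t], pts_seq2[t]: (N, 2) candidate silhouette points per frame.
import numpy as np
import cv2

def epipolar_inliers(F, pts1, pts2, thresh=2.0):
    """Count matches within `thresh` pixels of their epipolar line."""
    h1 = np.c_[pts1, np.ones(len(pts1))]
    h2 = np.c_[pts2, np.ones(len(pts2))]
    lines = h1 @ F.T                        # epipolar lines in image 2
    d = np.abs(np.sum(h2 * lines, axis=1)) / np.linalg.norm(lines[:, :2], axis=1)
    return int(np.sum(d < thresh))

def sync_and_calibrate(pts_seq1, pts_seq2, max_offset=10):
    best = (None, None, -1)                 # (offset, F, inlier count)
    for dt in range(-max_offset, max_offset + 1):
        p1, p2 = [], []
        for t in range(len(pts_seq1)):      # align frames by the offset dt
            if 0 <= t + dt < len(pts_seq2):
                p1.append(pts_seq1[t])
                p2.append(pts_seq2[t + dt])
        p1, p2 = np.vstack(p1), np.vstack(p2)
        F, _ = cv2.findFundamentalMat(p1, p2, cv2.FM_RANSAC, 2.0)
        if F is None:
            continue
        score = epipolar_inliers(F, p1, p2)
        if score > best[2]:
            best = (dt, F, score)
    return best
```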

    Space-Time Interpolation Techniques

    The photo-realistic modeling and animation of complex 3D scenes requires considerable work and skill from artists, even with modern acquisition techniques. This is especially true if the rendering must additionally be performed in real time. In this thesis we follow another direction in computer graphics and generate photo-realistic results from recorded video sequences of one or multiple cameras. We propose several methods for handling scenes showing natural phenomena as well as multi-view footage of general complex 3D scenes. In contrast to other approaches, we make use of relaxed geometric constraints and focus especially on the image properties important for creating perceptually plausible in-between images, computing an accurate solution only where it matters for human perception. The results are novel photo-realistic video sequences, rendered in real time, that allow interactive manipulation and the interactive exploration of novel viewpoints and time points.
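
    As a toy illustration of the space-time interpolation theme, here is a generic flow-based in-between frame using OpenCV's Farneback flow. The backward-warp cross-fade is a crude approximation and is not one of the thesis's methods, which rely on relaxed geometric constraints and perceptual criteria instead.

```python
# Hedged sketch: naive optical-flow-based in-between image (cross-faded
# backward warps). Purely illustrative; not the thesis's approach.
import numpy as np
import cv2

def interpolate_frame(img0, img1, alpha=0.5):
    """Synthesize a rough in-between frame at time alpha in [0, 1]."""
    g0 = cv2.cvtColor(img0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    ys, xs = np.mgrid[0:g0.shape[0], 0:g0.shape[1]].astype(np.float32)
    # backward-warp both inputs toward time alpha, then cross-fade
    w0 = cv2.remap(img0, xs - alpha * flow[..., 0],
                   ys - alpha * flow[..., 1], cv2.INTER_LINEAR)
    w1 = cv2.remap(img1, xs + (1 - alpha) * flow[..., 0],
                   ys + (1 - alpha) * flow[..., 1], cv2.INTER_LINEAR)
    return cv2.addWeighted(w0, 1 - alpha, w1, alpha, 0)
```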
    • 

    corecore