17,710 research outputs found

    3D Scene Reconstruction with Micro-Aerial Vehicles and Mobile Devices

    Full text link
    Scene reconstruction is the process of building an accurate geometric model of one\u27s environment from sensor data. We explore the problem of real-time, large-scale 3D scene reconstruction in indoor environments using small laser range-finders and low-cost RGB-D (color plus depth) cameras. We focus on computationally-constrained platforms such as micro-aerial vehicles (MAVs) and mobile devices. These platforms present a set of fundamental challenges - estimating the state and trajectory of the device as it moves within its environment and utilizing lightweight, dynamic data structures to hold the representation of the reconstructed scene. The system needs to be computationally and memory-efficient, so that it can run in real time, onboard the platform. In this work, we present three scene reconstruction systems. The first system uses a laser range-finder and operates onboard a quadrotor MAV. We address the issues of autonomous control, state estimation, path-planning, and teleoperation. We propose the multi-volume occupancy grid (MVOG) - a novel data structure for building 3D maps from laser data, which provides a compact, probabilistic scene representation. The second system uses an RGB-D camera to recover the 6-DoF trajectory of the platform by aligning sparse features observed in the current RGB-D image against a model of previously seen features. We discuss our work on camera calibration and the depth measurement model. We apply the system onboard an MAV to produce occupancy-based 3D maps, which we utilize for path-planning. Finally, we present our contributions to a scene reconstruction system for mobile devices with built-in depth sensing and motion-tracking capabilities. We demonstrate reconstructing and rendering a global mesh on the fly, using only the mobile device\u27s CPU, in very large (300 square meter) scenes, at a resolutions of 2-3cm. To achieve this, we divide the scene into spatial volumes indexed by a hash map. Each volume contains the truncated signed distance function for that area of space, as well as the mesh segment derived from the distance function. This approach allows us to focus computational and memory resources only in areas of the scene which are currently observed, as well as leverage parallelization techniques for multi-core processing

    PlaceRaider: Virtual Theft in Physical Spaces with Smartphones

    Full text link
    As smartphones become more pervasive, they are increasingly targeted by malware. At the same time, each new generation of smartphone features increasingly powerful onboard sensor suites. A new strain of sensor malware has been developing that leverages these sensors to steal information from the physical environment (e.g., researchers have recently demonstrated how malware can listen for spoken credit card numbers through the microphone, or feel keystroke vibrations using the accelerometer). Yet the possibilities of what malware can see through a camera have been understudied. This paper introduces a novel visual malware called PlaceRaider, which allows remote attackers to engage in remote reconnaissance and what we call virtual theft. Through completely opportunistic use of the camera on the phone and other sensors, PlaceRaider constructs rich, three dimensional models of indoor environments. Remote burglars can thus download the physical space, study the environment carefully, and steal virtual objects from the environment (such as financial documents, information on computer monitors, and personally identifiable information). Through two human subject studies we demonstrate the effectiveness of using mobile devices as powerful surveillance and virtual theft platforms, and we suggest several possible defenses against visual malware

    SurfelMeshing: Online Surfel-Based Mesh Reconstruction

    Full text link
    We address the problem of mesh reconstruction from live RGB-D video, assuming a calibrated camera and poses provided externally (e.g., by a SLAM system). In contrast to most existing approaches, we do not fuse depth measurements in a volume but in a dense surfel cloud. We asynchronously (re)triangulate the smoothed surfels to reconstruct a surface mesh. This novel approach enables to maintain a dense surface representation of the scene during SLAM which can quickly adapt to loop closures. This is possible by deforming the surfel cloud and asynchronously remeshing the surface where necessary. The surfel-based representation also naturally supports strongly varying scan resolution. In particular, it reconstructs colors at the input camera's resolution. Moreover, in contrast to many volumetric approaches, ours can reconstruct thin objects since objects do not need to enclose a volume. We demonstrate our approach in a number of experiments, showing that it produces reconstructions that are competitive with the state-of-the-art, and we discuss its advantages and limitations. The algorithm (excluding loop closure functionality) is available as open source at https://github.com/puzzlepaint/surfelmeshing .Comment: Version accepted to IEEE Transactions on Pattern Analysis and Machine Intelligenc

    A Low Cost UWB Based Solution for Direct Georeferencing UAV Photogrammetry

    Get PDF
    Thanks to their flexibility and availability at reduced costs, Unmanned Aerial Vehicles (UAVs) have been recently used on a wide range of applications and conditions. Among these, they can play an important role in monitoring critical events (e.g., disaster monitoring) when the presence of humans close to the scene shall be avoided for safety reasons, in precision farming and surveying. Despite the very large number of possible applications, their usage is mainly limited by the availability of the Global Navigation Satellite System (GNSS) in the considered environment: indeed, GNSS is of fundamental importance in order to reduce positioning error derived by the drift of (low-cost) Micro-Electro-Mechanical Systems (MEMS) internal sensors. In order to make the usage of UAVs possible even in critical environments (when GNSS is not available or not reliable, e.g., close to mountains or in city centers, close to high buildings), this paper considers the use of a low cost Ultra Wide-Band (UWB) system as the positioning method. Furthermore, assuming the use of a calibrated camera, UWB positioning is exploited to achieve metric reconstruction on a local coordinate system. Once the georeferenced position of at least three points (e.g., positions of three UWB devices) is known, then georeferencing can be obtained, as well. The proposed approach is validated on a specific case study, the reconstruction of the façade of a university building. Average error on 90 check points distributed over the building façade, obtained by georeferencing by means of the georeferenced positions of four UWB devices at fixed positions, is 0.29 m. For comparison, the average error obtained by using four ground control points is 0.18 m

    Adaptive User Perspective Rendering for Handheld Augmented Reality

    Full text link
    Handheld Augmented Reality commonly implements some variant of magic lens rendering, which turns only a fraction of the user's real environment into AR while the rest of the environment remains unaffected. Since handheld AR devices are commonly equipped with video see-through capabilities, AR magic lens applications often suffer from spatial distortions, because the AR environment is presented from the perspective of the camera of the mobile device. Recent approaches counteract this distortion based on estimations of the user's head position, rendering the scene from the user's perspective. To this end, approaches usually apply face-tracking algorithms on the front camera of the mobile device. However, this demands high computational resources and therefore commonly affects the performance of the application beyond the already high computational load of AR applications. In this paper, we present a method to reduce the computational demands for user perspective rendering by applying lightweight optical flow tracking and an estimation of the user's motion before head tracking is started. We demonstrate the suitability of our approach for computationally limited mobile devices and we compare it to device perspective rendering, to head tracked user perspective rendering, as well as to fixed point of view user perspective rendering

    InLoc: Indoor Visual Localization with Dense Matching and View Synthesis

    Get PDF
    We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph with respect to a large indoor 3D map. The contributions of this work are three-fold. First, we develop a new large-scale visual localization method targeted for indoor environments. The method proceeds along three steps: (i) efficient retrieval of candidate poses that ensures scalability to large-scale environments, (ii) pose estimation using dense matching rather than local features to deal with textureless indoor scenes, and (iii) pose verification by virtual view synthesis to cope with significant changes in viewpoint, scene layout, and occluders. Second, we collect a new dataset with reference 6DoF poses for large-scale indoor localization. Query photographs are captured by mobile phones at a different time than the reference 3D map, thus presenting a realistic indoor localization scenario. Third, we demonstrate that our method significantly outperforms current state-of-the-art indoor localization approaches on this new challenging data
    • …
    corecore