33 research outputs found
Algorithms for the reconstruction, analysis, repairing and enhancement of 3D urban models from multiple data sources
Over the last few years, there has been notable growth in the digitization of 3D buildings and urban environments. The substantial improvement of both scanning hardware and reconstruction algorithms has led to the development of representations of buildings and cities that can be remotely transmitted and inspected in real-time. Among the applications that implement these technologies are several GPS navigators and virtual globes such as Google Earth or the tools provided by the Institut Cartogràfic i Geològic de Catalunya.
In particular, in this thesis, we conceptualize cities as a collection of individual buildings. Hence, we focus on the individual processing of one structure at a time, rather than on the larger-scale processing of urban environments.
Nowadays, there is a wide diversity of digitization technologies, and choosing the appropriate one is key for each particular application. Roughly, these techniques can be grouped into three main families:
- Time-of-flight (terrestrial and aerial LiDAR).
- Photogrammetry (street-level, satellite, and aerial imagery).
- Human-edited vector data (cadastre and other map sources).
Each of these has its advantages in terms of covered area, data quality, economic cost, and processing effort.
Plane- and car-mounted LiDAR devices are optimal for sweeping huge areas, but acquiring and calibrating such devices is not a trivial task. Moreover, the capturing process is done by scan lines, which need to be registered using GPS and inertial data. As an alternative, terrestrial LiDAR devices are more accessible but cover smaller areas, and their sampling strategy usually produces massive point clouds with over-represented planar regions. A less expensive option is street-level imagery. A dense set of images captured with a commodity camera can be fed to state-of-the-art multi-view stereo algorithms to produce realistic-enough reconstructions. Another advantage of this approach is that it captures high-quality color data, although the resulting geometric information is usually of low quality.
In this thesis, we analyze in-depth some of the shortcomings of these data-acquisition methods and propose new ways to overcome them. Mainly, we focus on the technologies that allow high-quality digitization of individual buildings. These are terrestrial LiDAR for geometric information and street-level imagery for color information.
Our main goal is the processing and completion of detailed 3D urban representations. For this, we will work with multiple data sources and combine them when possible to produce models that can be inspected in real-time. Our research has focused on the following contributions:
- Effective and feature-preserving simplification of massive point clouds.
- Normal estimation algorithms explicitly designed for LiDAR data.
- Low-stretch panoramic representation for point clouds.
- Semantic analysis of street-level imagery for improved multi-view stereo reconstruction.
- Color improvement through heuristic techniques and the registration of LiDAR and imagery data.
- Efficient and faithful visualization of massive point clouds using image-based techniques.
SMPLitex: A Generative Model and Dataset for 3D Human Texture Estimation from Single Image
We propose SMPLitex, a method for estimating and manipulating the complete 3D
appearance of humans captured from a single image. SMPLitex builds upon the
recently proposed generative models for 2D images, and extends their use to the
3D domain through pixel-to-surface correspondences computed on the input image.
To this end, we first train a generative model for complete 3D human
appearance, and then fit it into the input image by conditioning the generative
model to the visible parts of the subject. Furthermore, we propose a new
dataset of high-quality human textures built by sampling SMPLitex conditioned
on subject descriptions and images. We quantitatively and qualitatively
evaluate our method on three publicly available datasets, demonstrating that
SMPLitex significantly outperforms existing methods for human texture
estimation while allowing for a wider variety of tasks such as editing,
synthesis, and manipulation. Comment: Accepted at BMVC 2023. Project website:
https://dancasas.github.io/projects/SMPLite
3D Reconstruction of vegetation from orthophotos
Using Computer Vision techniques, we are able to detect and classify the flora present in an orthophoto. With this information, and with a self-produced base set of 3D plant models, we generate an approximate but plausible 3D reproduction of the vegetation that appears in the orthophoto.
Revisiting Poisson-disk Subsampling for Massive Point Cloud Decimation
Scanning devices often produce point clouds exhibiting highly uneven
distributions of point samples across the surfaces being captured. Different
point cloud subsampling techniques have been proposed to generate more evenly
distributed samples. Poisson-disk sampling approaches assign each sample a cost
value so that subsampling reduces to sorting the samples by cost and then
removing the desired ratio of samples with the highest cost. Unfortunately,
these approaches compute the sample cost using pairwise distances of the points
within a constant search radius, which is very costly for massive point clouds
with uneven densities. In this paper, we revisit Poisson-disk sampling for
point clouds. Instead of optimizing for equal densities, we propose to maximize
the distance to the closest point, which is equivalent to estimating the local
point density as a value inversely proportional to this distance. This
algorithm can be efficiently implemented using k nearest-neighbors searches.
Besides a kd-tree, our algorithm also uses a voxelization to speed up the
searches required to compute per-sample costs. We propose a new strategy to
minimize cost updates that is amenable to out-of-core operation. We
demonstrate the benefits of our approach in terms of performance, scalability,
and output quality. We also discuss extensions based on adding
orientation-based and color-based terms to the cost function.
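The core idea of the abstract above (treat a small distance to the closest point as a sign of high local density, and remove those samples first) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a brute-force neighbor search instead of the kd-tree and voxelization mentioned above, recomputes every cost after each removal instead of minimizing cost updates, and the function names `nearest_distance` and `decimate` are hypothetical.

```python
import math

def nearest_distance(points, i, alive):
    """Distance from point i to its closest surviving neighbor (brute force)."""
    return min(
        (math.dist(points[i], points[j]) for j in alive if j != i),
        default=math.inf,
    )

def decimate(points, keep):
    """Greedily drop the most crowded sample until `keep` points remain."""
    alive = set(range(len(points)))
    while len(alive) > keep:
        # A smaller nearest-neighbor distance means a denser neighborhood,
        # so that sample is removed first. Ties are broken by index so the
        # result is deterministic.
        victim = min(alive, key=lambda i: (nearest_distance(points, i, alive), i))
        alive.remove(victim)
    return [points[i] for i in sorted(alive)]

if __name__ == "__main__":
    # Three tightly packed samples followed by two isolated ones.
    cloud = [(0, 0), (1, 0), (2, 0), (50, 0), (100, 0)]
    print(decimate(cloud, keep=3))
```

The sketch is quadratic per removal and only suitable for toy inputs; a spatial index would be needed for the massive point clouds the abstract targets.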
PERGAMO: Personalized 3D Garments from Monocular Video
Clothing plays a fundamental role in digital humans. Current approaches to
animate 3D garments are mostly based on realistic physics simulation; however,
they typically suffer from two main issues: high computational run-time cost,
which hinders their development; and simulation-to-real gap, which impedes the
synthesis of specific real-world cloth samples. To circumvent both issues we
propose PERGAMO, a data-driven approach to learn a deformable model for 3D
garments from monocular images. To this end, we first introduce a novel method
to reconstruct the 3D geometry of garments from a single image, and use it to
build a dataset of clothing from monocular videos. We use these 3D
reconstructions to train a regression model that accurately predicts how the
garment deforms as a function of the underlying body pose. We show that our
method is capable of producing garment animations that match the real-world
behaviour, and generalizes to unseen body motions extracted from a motion capture
dataset. Comment: Published at Computer Graphics Forum (Proc. of ACM/SIGGRAPH SCA),
2022. Project website http://mslab.es/projects/PERGAMO
A software framework for the development of projection-based augmented reality systems
Despite the large number of methods and applications of augmented reality, there is little homogenization in the software platforms that support them. An exception may be the low-level control software that is provided by some high-profile vendors such as Qualcomm and Metaio. However, these provide fine-grained modules for, e.g., element tracking. We are more concerned with the application framework, which includes the control of the devices working together for the development of the AR experience. In this paper we present a software framework that can be used for the development of AR applications based on camera-projector pairs, and that is suitable for both fixed and nomadic setups. Peer Reviewed. Postprint (author's final draft).
Error-aware construction and rendering of multi-scan panoramas from massive point clouds
Obtaining realistic 3D models of urban scenes from accurate range data is nowadays an important research topic, with applications in a variety of fields ranging from Cultural Heritage and digital 3D archiving to monitoring of public works. Processing massive point clouds acquired from laser scanners involves a number of challenges, from data management to noise removal, model compression and interactive visualization and inspection. In this paper, we present a new methodology for the reconstruction of 3D scenes from massive point clouds coming from range lidar sensors. Our proposal includes a panorama-based compact reconstruction where colors and normals are estimated robustly through an error-aware algorithm that takes into account the variance of expected errors in depth measurements. Our representation supports efficient, GPU-based visualization with advanced lighting effects. We discuss the proposed algorithms in a practical application to urban and historical preservation, described by a massive point cloud of 3.5 billion points. We show that we can achieve compression rates higher than 97% with good visual quality during interactive inspections. Peer Reviewed. Postprint (author's final draft).
A Tool for N-way analysis of programming exercises
Programming exercises are a cornerstone of Computer Science courses. If used properly, these exercises provide valuable feedback both to students and instructors. Unfortunately, the assessment of student submissions through code inspection requires a considerable amount of time. In this work we present an interactive tool to support the analysis of code submissions before, during, and after grading. The key idea is to compute a dissimilarity matrix for code submissions, using a metric that incorporates syntactic, semantic and functional aspects of the code. This matrix is used to embed the submissions in 2D space, so that similar submissions are mapped to nearby locations. The tool allows users to visually identify clusters, inspect individual submissions, and perform detailed pair-wise and abridged n-way comparisons. Finally, our approach facilitates comparative scoring by presenting submissions in a nearly-optimal order, i.e. similar submissions appear close in the sequence. Our initial evaluation indicates that the tool (currently supporting C++/GLSL code) provides clear benefits both to students (fairer scores, less bias, more consistent feedback) and instructors (less effort, better feedback on student performance). This work has been funded by the Spanish Ministry of Economy and Competitiveness and FEDER Grant TIN2017-88515-C2-1-R. Peer Reviewed. Postprint (published version).
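As a rough illustration of the two steps named in the abstract above (a pairwise dissimilarity matrix, then an ordering in which similar submissions appear close together), here is a minimal sketch. It rests on loud assumptions: plain textual similarity from Python's `difflib` stands in for the tool's metric, which per the abstract also covers semantic and functional aspects, and a greedy nearest-neighbor chain stands in for whatever ordering strategy the tool uses. The names `dissimilarity_matrix` and `grading_order` are hypothetical.

```python
from difflib import SequenceMatcher

def dissimilarity_matrix(sources):
    """Symmetric matrix of 1 - textual similarity between submissions."""
    n = len(sources)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            ratio = SequenceMatcher(None, sources[i], sources[j]).ratio()
            # Compute once per pair and mirror, so the matrix is symmetric.
            d[i][j] = d[j][i] = 1.0 - ratio
    return d

def grading_order(d, start=0):
    """Greedy chain: always visit the unvisited submission closest to the last."""
    if not d:
        return []
    order = [start]
    remaining = set(range(len(d))) - {start}
    while remaining:
        last = order[-1]
        # Ties are broken by index to keep the order deterministic.
        nxt = min(remaining, key=lambda j: (d[last][j], j))
        order.append(nxt)
        remaining.remove(nxt)
    return order

if __name__ == "__main__":
    submissions = [
        "float f(float x) { return x * x; }",
        "vec3 shade(vec3 n) { return normalize(n); }",
        "float f(float y) { return y * y; }",
    ]
    print(grading_order(dissimilarity_matrix(submissions)))
```

The same matrix could be passed to a multidimensional scaling routine to obtain 2D positions; that step is omitted here to keep the sketch dependency-free.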
Sweep encoding: Serializing space subdivision schemes for optimal slicing
Slicing a model (computing thin slices of a geometric or volumetric model with a sweeping plane) is necessary for several applications ranging from 3D printing to medical imaging. This paper introduces a technique designed to compute these slices efficiently, even for huge and complex models. We voxelize the volume of the model at a required resolution and show how to encode this voxelization in an out-of-core octree using a novel Sweep Encoding linearization. This approach allows for efficient slicing with bounded cost per slice. We discuss specific applications, including 3D printing, and compare these octrees’ performance against the standard representations in the literature. This work has been partially funded by the Spanish Ministry of Science and Innovation (MCIN / AEI / 10.13039/501100011033) and FEDER (‘‘A way to make Europe’’) under grant TIN2017-88515-C2-1-R. Peer Reviewed. Postprint (published version).
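The general benefit of serializing a voxelization along the sweep direction can be shown with a toy example. This is explicitly not the paper's Sweep Encoding, whose details are not given in the abstract: a flat, in-memory sorted list stands in for the out-of-core octree, and the names `serialize` and `slice_at` are hypothetical. The point is only that, once voxels are ordered by the sweep coordinate, each slice is a contiguous run that can be located by binary search.

```python
from bisect import bisect_left, bisect_right

def serialize(voxels):
    """Order occupied voxels (x, y, z) by the sweep coordinate z first."""
    return sorted((z, y, x) for x, y, z in voxels)

def slice_at(stream, z):
    """Return the (x, y) cells of slice z as one contiguous run of the stream."""
    # (z,) sorts before every (z, y, x); (z, inf) sorts after all of them.
    lo = bisect_left(stream, (z,))
    hi = bisect_right(stream, (z, float("inf")))
    return [(x, y) for _, y, x in stream[lo:hi]]

if __name__ == "__main__":
    occupied = [(0, 0, 0), (1, 0, 0), (0, 1, 1), (2, 2, 1), (5, 5, 3)]
    stream = serialize(occupied)
    for z in range(4):
        print(z, slice_at(stream, z))
```

With this layout, extracting a slice costs two binary searches plus the size of the slice itself, independent of how many voxels lie in other slices.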
Segmentation of aerial images for plausible detail synthesis
The visual enrichment of digital terrain models with plausible synthetic detail requires the segmentation of aerial images into a suitable collection of categories. In this paper we present a complete pipeline for segmenting high-resolution aerial images into a user-defined set of categories distinguishing e.g. terrain, sand, snow, water, and different types of vegetation. This segmentation-for-synthesis problem implies that per-pixel categories must be established according to the algorithms chosen for rendering the synthetic detail. This precludes the definition of a universal set of labels and hinders the construction of large training sets. Since artists might choose to add new categories on the fly, the whole pipeline must be robust against unbalanced datasets, and fast in both training and inference. Under these constraints, we analyze the contribution of common per-pixel descriptors, and compare the performance of state-of-the-art supervised learning algorithms. We report the findings of two user studies. The first one was conducted to analyze human accuracy when manually labeling aerial images. The second user study compares detailed terrains built using different segmentation strategies, including official land cover maps. These studies demonstrate that our approach can be used to turn digital elevation models into fully-featured, detailed terrains with minimal authoring effort. Peer Reviewed. Postprint (author's final draft).