
    Point2Vec for Self-Supervised Representation Learning on Point Clouds

    Recently, the self-supervised learning framework data2vec has shown inspiring performance for various modalities using a masked student-teacher approach. However, it remains open whether such a framework generalizes to the unique challenges of 3D point clouds. To answer this question, we extend data2vec to the point cloud domain and report encouraging results on several downstream tasks. In an in-depth analysis, we discover that the leakage of positional information reveals the overall object shape to the student even under heavy masking and thus hampers data2vec's ability to learn strong representations for point clouds. We address this 3D-specific shortcoming by proposing point2vec, which unleashes the full potential of data2vec-like pre-training on point clouds. Our experiments show that point2vec outperforms other self-supervised methods on shape classification and few-shot learning on ModelNet40 and ScanObjectNN, while achieving competitive results on part segmentation on ShapeNetParts. These results suggest that the learned representations are strong and transferable, highlighting point2vec as a promising direction for self-supervised learning of point cloud representations.
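
    A minimal sketch of the masked student-teacher idea described above, assuming a PyTorch setup; the patch embedding, mask ratio, and layer sizes are illustrative assumptions, not the authors' implementation:

        # Hypothetical data2vec-style masked latent prediction over point-cloud patches.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        dim = 384
        student = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=6, batch_first=True), num_layers=4)
        teacher = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=6, batch_first=True), num_layers=4)
        teacher.load_state_dict(student.state_dict())       # teacher starts as a copy; EMA update not shown
        for p in teacher.parameters():
            p.requires_grad_(False)

        patch_embed = nn.Sequential(nn.Linear(3, 128), nn.GELU(), nn.Linear(128, dim))
        pos_embed = nn.Linear(3, dim)                        # positional embedding of patch centers
        mask_token = nn.Parameter(torch.zeros(1, 1, dim))

        def training_step(patches, centers, mask_ratio=0.65):
            # patches: (B, N, G, 3) local point groups; centers: (B, N, 3) group centroids
            B, N = centers.shape[:2]
            tokens = patch_embed(patches).max(dim=2).values   # (B, N, dim) one token per patch
            pos = pos_embed(centers)

            with torch.no_grad():                             # teacher sees every patch
                targets = teacher(tokens + pos)

            mask = torch.rand(B, N) < mask_ratio              # hide a majority of patches
            student_in = torch.where(mask.unsqueeze(-1), mask_token.expand(B, N, dim), tokens)
            # Note: feeding `pos` for masked tokens is exactly the positional leakage the paper
            # identifies; point2vec reportedly avoids it by encoding only visible patches in the
            # student and introducing mask tokens only in a shallow decoder.
            preds = student(student_in + pos)
            return F.smooth_l1_loss(preds[mask], targets[mask])  # regress teacher latents at masked positions

        loss = training_step(torch.randn(2, 64, 32, 3), torch.randn(2, 64, 3))  # toy batch
        loss.backward()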

    Productive Vision: Methods for Automatic Image Comprehension

    Image comprehension is the ability to summarize, translate, and answer basic questions about images. Using original techniques for scene object parsing, material labeling, and activity recognition, a system can gather information about the objects and actions in a scene. When this information is integrated into a deep knowledge base capable of inference, the system becomes capable of performing tasks that, when performed by students, are considered by educators to demonstrate comprehension. The vision components of the system consist of the following: object scene parsing by means of visual filters, material scene parsing by superpixel segmentation and kernel descriptors, and activity recognition by action grammars. These techniques are characterized and compared with the state of the art in their respective fields. The output of the vision components is a list of assertions in a Cyc microtheory. By reasoning on these assertions and the rest of the Cyc knowledge base, the system is able to perform a variety of tasks, including the following: recognize that essential parts of objects are likely present in the scene despite not having an explicit detector for them; recognize the likely presence of objects due to the presence of their essential parts; improve estimates of both object and material labels by incorporating knowledge about typical pairings; label ambiguous objects with a more general label that encompasses both possible labelings; answer questions about the scene that require inference and give justifications for the answers in natural language; create a visual representation of the scene in a new medium; and recognize scene similarity even when there is little visual similarity.
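
    A toy illustration of the part/whole reasoning described above, using plain Python rules in place of the Cyc knowledge base; the facts and rules here are invented for illustration:

        # Toy forward-chaining over scene assertions: infer likely parts from detected wholes
        # and likely wholes from detected essential parts. Not the Cyc microtheory itself.
        ESSENTIAL_PART_OF = {       # part -> whole (illustrative assumptions)
            "wheel": "bicycle",
            "keyboard": "laptop",
        }

        def infer(detected):
            facts = set(detected)
            for part, whole in ESSENTIAL_PART_OF.items():
                if whole in facts:                  # a detected whole implies its essential parts
                    facts.add(f"likely({part})")
                if part in facts:                   # a detected essential part suggests its whole
                    facts.add(f"likely({whole})")
            return facts

        print(infer({"bicycle"}))                   # {'bicycle', 'likely(wheel)'}
        print(infer({"keyboard", "mug"}))           # adds 'likely(laptop)'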

    Lidar Point Cloud compression, processing and learning for Autonomous Driving

    As technology advances, cities are getting smarter. Smart mobility is a key element of smart cities, and Autonomous Vehicles (AVs) are an essential part of smart mobility. However, the vulnerability of unmanned vehicles can also put human life and safety at risk. In this paper, we provide a comprehensive analysis of 3D Point-Cloud (3DPC) processing and learning in terms of development, advancement, and performance for the AV system. 3DPC has recently attracted growing interest due to its extensive applications, such as autonomous driving, computer vision, and robotics. Light Detection and Ranging (LiDAR) is one of the most important sensors in AVs; it collects 3DPC data that can accurately capture the outer surfaces of scenes and objects. 3DPC learning and processing tools are essential modules for mapping, localization, and perception in an AV system. The goal of the study is to establish what has been tested in AV systems so far and what is necessary to make them safer and more practical. We also provide insights into the open problems that need to be resolved in the future.

    Assessing Seagrass Restoration Actions through a Micro-Bathymetry Survey Approach (Italy, Mediterranean Sea)

    Underwater photogrammetry provides a means of generating high-resolution products such as dense point clouds, 3D models, and orthomosaics with centimetric-scale resolutions. Underwater photogrammetric models can be used to monitor the growth and expansion of benthic communities, including the assessment of the conservation status of seagrass beds and their change over time (time-lapse micro-bathymetry) with Object-Based Image Analysis (OBIA) classifications. However, one of the most complex aspects of underwater photogrammetry is the accuracy of the 3D models for both the horizontal and vertical components used to estimate the surfaces and volumes of biomass. In this study, a photogrammetry-based micro-bathymetry approach was applied to monitor Posidonia oceanica restoration actions. A procedure for rectifying both the horizontal and vertical elevation data was developed using soundings from high-resolution multibeam bathymetry. Furthermore, a 3D trilateration technique was also tested to collect Ground Control Points (GCPs) together with reference scale bars, both used to estimate the accuracy of the models and orthomosaics. The root mean square error (RMSE) value obtained for the horizontal planimetric measurements was 0.05 m, while the RMSE value for the depth was 0.11 m. Underwater photogrammetry, if properly applied, can provide very high-resolution and accurate models for monitoring seagrass restoration actions for ecological recovery and can be useful for other research purposes in geological and environmental monitoring.
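
    As a small illustration of how such accuracy figures are typically computed, the sketch below (with made-up checkpoint coordinates) derives planimetric and depth RMSE values from model-versus-GCP residuals:

        # Hypothetical RMSE of a photogrammetric model against surveyed control points (metres).
        import numpy as np

        model_xyz = np.array([[10.02, 5.01, -12.10],
                              [20.05, 7.98, -12.55],
                              [30.01, 9.03, -13.02]])
        gcp_xyz   = np.array([[10.00, 5.00, -12.00],
                              [20.00, 8.00, -12.45],
                              [30.05, 9.00, -13.10]])

        res = model_xyz - gcp_xyz
        rmse_xy = np.sqrt(np.mean(np.sum(res[:, :2] ** 2, axis=1)))   # horizontal (planimetric)
        rmse_z  = np.sqrt(np.mean(res[:, 2] ** 2))                    # vertical (depth)
        print(f"RMSE XY: {rmse_xy:.3f} m, RMSE Z: {rmse_z:.3f} m")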

    Effects of Aerial LiDAR Data Density on the Accuracy of Building Reconstruction

    Previous work has identified a positive relationship between the density of the aerial LiDAR input for building reconstruction and the accuracy of the resulting reconstructed models. We hypothesize a point of diminishing returns at which higher data density no longer contributes meaningfully to higher accuracy in the end product. We investigate this relationship by subsampling a high-density dataset from the City of Surrey, BC to different densities and reconstructing each subsampled dataset with two different reconstruction methods. We then determine the accuracy of reconstruction against manually created reference data, in terms of both 2D footprint accuracy and 3D model accuracy. We find no quantitative evidence for meaningfully improved output accuracy at densities higher than 4 points/m² for either method, although aesthetic improvements at higher point cloud densities are noted for one method.
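
    A rough sketch of the subsampling step implied above, assuming simple random decimation to a target density (the study's actual subsampling strategy may differ):

        # Hypothetical decimation of an aerial LiDAR tile to a target density in points/m^2.
        import numpy as np

        def subsample_to_density(points, target_density, area_m2, seed=0):
            rng = np.random.default_rng(seed)
            n_keep = min(len(points), int(target_density * area_m2))
            return points[rng.choice(len(points), size=n_keep, replace=False)]

        tile = np.random.uniform(0, 100, size=(300_000, 3))             # toy 100 m x 100 m tile
        sub = subsample_to_density(tile, target_density=4, area_m2=100 * 100)
        print(len(sub) / (100 * 100), "points per m^2")                 # ~4.0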

    Reconstruction of Tubular Shapes from Point Clouds: Application to the Estimation of Forest Geometry

    The capabilities of remote sensing technologies have increased exponentially in recent years: new sensors now provide a geometric representation of their environment in the form of point clouds with unrivalled accuracy. Point cloud processing has hence become a discipline in its own right, with its own problems and many challenges to address. The core of this thesis concerns geometric modelling and introduces a fast and robust method for the extraction of tubular shapes from point clouds. We chose to test our methods in the challenging applicative context of forestry in order to highlight the robustness of our algorithms and their applicability to large data sets. Our methods integrate per-point normals as supplementary geometric information in order to achieve the performance necessary for processing large point clouds. However, sensors do not generally provide normals, so they have to be pre-computed. To preserve speed, our first contribution is therefore a fast normal estimation method: we locally approximate the point cloud geometry using smooth "patches" whose size adapts to the local complexity of the point cloud. We then focused on the robust extraction of tubular shapes from dense, occluded, noisy point clouds with non-homogeneous sampling density. To this end, we developed a variant of the Hough transform whose complexity is reduced thanks to the computed normals. We then coupled this work with a formulation of parametrisation-invariant active contours. This combination ensures the internal coherence of the reconstructed shapes and alleviates the issues related to occlusion, noise, and variations in sampling density. Our method was validated in complex forest environments by reconstructing tree stems and comparing the results with existing methods. Tree stem reconstruction opens further questions halfway between forestry and geometry; one of them is the segmentation of the trees of a forest plot. We therefore also propose a segmentation method designed to work around the defects of forest point clouds and to isolate the different objects in a data set. Throughout this work we used modelling approaches to answer geometric questions and applied them to forestry problems. The result is a coherent processing pipeline which, although illustrated on forest data, is applicable in a variety of contexts.
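
    A rough sketch of the two geometric ingredients above, assuming NumPy/SciPy: plain k-NN PCA normals per point (rather than the adaptive patches of the thesis) and the property exploited by a normal-driven Hough variant that cylinder normals are perpendicular to the cylinder axis:

        # Illustrative only: k-NN PCA normals and axis recovery for a cylindrical point set.
        import numpy as np
        from scipy.spatial import cKDTree

        def pca_normals(points, k=16):
            tree = cKDTree(points)
            _, idx = tree.query(points, k=k)
            normals = np.empty_like(points)
            for i, nb in enumerate(idx):
                nbh = points[nb] - points[nb].mean(axis=0)
                _, _, vt = np.linalg.svd(nbh, full_matrices=False)
                normals[i] = vt[-1]                  # direction of least local variance
            return normals

        def cylinder_axis(normals):
            # On a cylinder every normal is orthogonal to the axis, so the axis is the
            # right singular vector of the normal matrix with the smallest singular value.
            _, _, vt = np.linalg.svd(normals, full_matrices=False)
            return vt[-1]

        t = np.random.uniform(0, 2 * np.pi, 2000)                       # toy noisy stem
        z = np.random.uniform(0, 2.0, 2000)
        pts = np.c_[0.2 * np.cos(t), 0.2 * np.sin(t), z] + np.random.normal(0, 0.003, (2000, 3))
        print(cylinder_axis(pca_normals(pts)))                          # close to (0, 0, +/-1)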

    Silhouette-Aware Warping for Image-Based Rendering

    Image-based rendering (IBR) techniques allow capture and display of 3D environments using photographs. Modern IBR pipelines reconstruct proxy geometry using multi-view stereo, reproject the photographs onto the proxy, and blend them to create novel views. The success of these methods depends on accurate 3D proxies, which are difficult to obtain for complex objects such as trees and cars. Large numbers of input images do not improve reconstruction proportionally; surface extraction is challenging, even from dense range scans, for scenes containing such objects. Our approach does not depend on dense, accurate geometric reconstruction; instead, we compensate for sparse 3D information by variational image warping. In particular, we formulate silhouette-aware warps that preserve salient depth discontinuities. This improves the rendering of difficult foreground objects, even when deviating from view interpolation. We use a semi-automatic step to identify depth discontinuities and extract a sparse set of depth constraints used to guide the warp. Our framework is lightweight and produces good-quality IBR for previously challenging environments.
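
    The sketch below gives a toy one-dimensional analogue of such a constraint-guided warp, under assumed weights and constraints: smoothness is enforced everywhere except across a marked silhouette, so the warp follows the sparse constraints yet stays free to jump at the discontinuity:

        # Toy 1-D variational warp: least-squares smoothness terms, dropped at the silhouette,
        # plus sparse data constraints. Illustrative assumptions only, not the paper's solver.
        import numpy as np

        n, silhouette = 20, 10                      # grid vertices; smoothness broken after vertex 10
        constraints = {3: 3.4, 15: 16.8}            # vertex index -> target position
        w_smooth, w_data = 1.0, 10.0

        rows, rhs = [], []
        for i in range(n - 1):                      # keep neighbouring vertices ~1 unit apart...
            if i == silhouette:                     # ...except across the silhouette edge
                continue
            r = np.zeros(n); r[i], r[i + 1] = -w_smooth, w_smooth
            rows.append(r); rhs.append(w_smooth * 1.0)
        for i, target in constraints.items():       # sparse depth-derived constraints
            r = np.zeros(n); r[i] = w_data
            rows.append(r); rhs.append(w_data * target)

        warped, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
        print(np.round(warped, 2))                  # smooth on each side, jumps at the silhouette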

    Machine Learning in Adversarial Environments

    Machine Learning, especially Deep Neural Nets (DNNs), has achieved great success in a variety of applications. Unlike classical algorithms that can be formally analyzed, neural network-based learning algorithms are less well understood. This lack of understanding, through either formal methods or empirical observation, results in potential vulnerabilities that could be exploited by adversaries, and it hinders the deployment and adoption of learning methods in security-critical systems. Recent work has demonstrated that DNNs are vulnerable to carefully crafted adversarial perturbations. We refer to data instances with added adversarial perturbations as “adversarial examples”. Such adversarial examples can mislead DNNs into producing adversary-selected results and can cause a DNN system to misbehave in unexpected and potentially dangerous ways. In this thesis, we therefore study the security of current DNNs from the viewpoints of both attack and defense. First, we explore the space of attacks against DNNs at test time. We revisit the integrity of the Lp regime and propose a new and rigorous threat model for adversarial examples. Based on this new threat model, we present a technique to generate adversarial examples in the digital space. Second, we study the physical consequences of adversarial examples in 3D and physical spaces. We first study the vulnerabilities of various vision systems by simulating the photo-taking process with a physical renderer. To further explore the physical consequences in the real world, we select the safety-critical application of autonomous driving as the target system and study the vulnerability of the LiDAR perception module. These studies show the potentially severe consequences of adversarial examples and raise awareness of their risks. Last but not least, we develop solutions to defend against adversarial examples. We propose a consistency-check based method to detect adversarial examples by leveraging properties of either the learning model or the data. We show two examples, in the segmentation task (leveraging the learning model) and on video data (leveraging the data), respectively.
    PhD thesis, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/162944/1/xiaocw_1.pd
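
    For illustration, a minimal FGSM-style digital attack in the Lp (here L-infinity) setting; the model, label, and epsilon are placeholders, and the thesis goes well beyond this classic one-step perturbation:

        # Classic one-step FGSM perturbation as a minimal example of a digital adversarial attack.
        import torch
        import torch.nn.functional as F
        import torchvision.models as models

        model = models.resnet18(weights=None).eval()         # untrained stand-in for a target model
        x = torch.rand(1, 3, 224, 224, requires_grad=True)   # input image in [0, 1]
        y = torch.tensor([7])                                 # assumed true label

        loss = F.cross_entropy(model(x), y)
        loss.backward()

        eps = 8 / 255                                                    # L_inf budget
        x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()           # one step up the loss
        print((x_adv - x).abs().max().item())                            # perturbation stays <= eps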