    Scalability of Learning Tasks on 3D CAE Models Using Point Cloud Autoencoders

    Geometric Deep Learning (GDL) methods have recently gained interest as powerful, high-dimensional models for approaching various geometry processing tasks. However, training deep neural network models on geometric input requires considerable computational effort, even more so for the problem sizes typical of application domains such as engineering, where geometric data are often orders of magnitude larger than the inputs currently considered in the GDL literature. Hence, an assessment of the scalability of the training task is necessary, in which model and data set parameters are mapped to the computational demand during training. The present paper therefore studies the effects of data set size and the number of free model parameters on the computational effort of training a Point Cloud Autoencoder (PC-AE). We further review pre-processing techniques to obtain efficient representations of high-dimensional inputs to the PC-AE and investigate the effects of these techniques on the information abstracted by the trained model. We perform these experiments on synthetic geometric data inspired by engineering applications, using computing hardware with recent graphics processing units (GPUs) with high memory specifications. The present study thus provides a comprehensive evaluation of how to scale geometric deep learning architectures to high-dimensional inputs, allowing state-of-the-art deep learning methods to be applied in real-world tasks.
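
To make the link between input size and free model parameters concrete, the following is a minimal sketch that counts the parameters of a hypothetical point cloud autoencoder. The architecture is an assumption, not taken from the abstract: an encoder of per-point layers shared across points followed by max pooling, and a fully connected decoder that outputs all point coordinates at once. The layer widths and the function names `dense_params` and `pc_ae_param_count` are illustrative only.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected (or per-point shared) layer."""
    return n_in * n_out + n_out


def pc_ae_param_count(n_points, latent_dim, enc_widths=(64, 128), dec_widths=(256, 256)):
    """Count free parameters of a hypothetical point cloud autoencoder.

    Encoder: layers applied to each 3D point with shared weights, so the
    count does not depend on n_points. Decoder: dense layers whose final
    layer emits n_points * 3 coordinates (assumed design, see lead-in).
    """
    enc_dims = [3, *enc_widths, latent_dim]
    dec_dims = [latent_dim, *dec_widths, n_points * 3]
    encoder = sum(dense_params(a, b) for a, b in zip(enc_dims, enc_dims[1:]))
    decoder = sum(dense_params(a, b) for a, b in zip(dec_dims, dec_dims[1:]))
    return encoder, decoder


# Compare a typical benchmark resolution with inputs 10x and 100x larger.
for n in (2048, 20480, 204800):
    enc, dec = pc_ae_param_count(n_points=n, latent_dim=128)
    print(f"{n:>7} points: encoder {enc:,} params, decoder {dec:,} params")
```

Under these assumptions the encoder size stays constant while the decoder's last layer grows linearly with the number of points, which is one reason the input resolution matters for the training cost.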

    Rekonstruktion und skalierbare Detektion und Verfolgung von 3D Objekten

    The task of detecting objects in images is essential for autonomous systems to categorize, comprehend and eventually navigate or manipulate their environment. Since many applications demand not only the detection of objects but also the estimation of their exact poses, 3D CAD models can prove helpful, since they provide means for feature extraction and hypothesis refinement. This work therefore explores two paths: firstly, we look into methods to create richly textured and geometrically accurate models of real-life objects. Using these reconstructions as a basis, we then investigate how to improve 3D object detection and pose estimation, focusing especially on scalability, i.e. the problem of dealing with multiple objects simultaneously.

    Learning-based generative representations for automotive design optimization

    In automotive design optimizations, engineers intuitively look for suitable representations of CAE models that can be used across different optimization problems. Determining a suitable compact representation of 3D CAE models facilitates faster search and optimization of 3D designs. Therefore, to support novice designers in the automotive design process, we envision a cooperative design system (CDS) which learns the experience embedded in past optimization data and is able to provide assistance to the designer while performing an engineering design optimization task. The research in this thesis addresses different aspects that can be combined to form a CDS framework. First, based on a survey of deep learning techniques, a point cloud variational autoencoder (PC-VAE) is adapted from the literature, extended and evaluated as a shape generative model in design optimizations. The performance of the PC-VAE is verified with respect to state-of-the-art architectures. The PC-VAE is capable of generating a continuous low-dimensional search space for 3D designs, which further supports the generation of novel realistic 3D designs through interpolation and sampling in the latent space. In general, when designing a 3D car, engineers need to consider multiple structural or functional performance criteria. Hence, in the second step, the latent representations of the PC-VAE are evaluated for generating novel designs satisfying multiple criteria and user preferences. A seeding method is proposed to provide a warm start to the optimization process and improve convergence time. Further, to replace expensive simulations for performance estimation in an optimization task, surrogate models are trained to map the latent representation of each input 3D design to its respective geometric and functional performance measures. However, the performance of the PC-VAE is less consistent due to additional regularization of the latent space.
Thirdly, to better understand which distinct region of the input 3D design is learned by a particular latent variable of the PC-VAE, a new deep generative model is proposed (Split-AE), which is an extension of the existing autoencoder architecture. The Split-AE learns input 3D point cloud representations and generates two sets of latent variables for each 3D design. The first set of latent variables, referred to as content, helps to represent the overall underlying structure of the 3D shape and to discriminate it from other semantic shape categories. The second set of latent variables, referred to as style, represents the unique shape part of the input 3D shape, which allows grouping of shapes into shape classes. The reconstruction and latent variable disentanglement properties of the Split-AE are compared with other state-of-the-art architectures. In a series of experiments, it is shown that for given input shapes, the Split-AE is capable of generating content and style variables, which give the flexibility to transfer and combine style features between different shapes. Thus, the Split-AE is able to disentangle features with minimum supervision and helps in generating novel shapes that are modified versions of the existing designs. Lastly, to demonstrate the application of our initially envisioned CDS, two interactive systems were developed to assist designers in exploring design ideas. In the first CDS framework, the latent variables of the PC-VAE are integrated with a graphical user interface. This framework enables the designer to explore designs taking into account the data-driven knowledge and different performance measures of 3D designs. The second interactive system aims to guide designers towards their design targets, for which past human experiences of performing 3D design modifications are captured and learned using a machine learning model.
The trained model is then used to guide (novice) engineers and designers by predicting the next design modification step based on the currently applied changes.
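
The abstract mentions generating novel designs through interpolation in the latent space. A minimal sketch of that idea is shown here, assuming plain linear interpolation between two latent codes; the thesis may use a different scheme. The function name `interpolate_latents` and the example vectors are hypothetical, and the decoder that would turn each latent vector back into a point cloud is not shown.

```python
def interpolate_latents(z_a, z_b, steps):
    """Return `steps` latent vectors evenly spaced on the line from z_a to z_b."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    if len(z_a) != len(z_b):
        raise ValueError("latent vectors must have the same dimension")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 (z_a) to 1.0 (z_b)
        path.append([(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path


# Made-up 3-dimensional latent codes standing in for two encoded car shapes.
z_car_a = [0.0, 1.0, -2.0]
z_car_b = [1.0, 0.0, 2.0]
path = interpolate_latents(z_car_a, z_car_b, steps=5)
for z in path:
    print(z)
```

Each intermediate vector would be passed through the trained decoder to obtain a candidate shape between the two input designs.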

    Predicting the internal model of a robotic system from its morphology

    The estimation of the internal model of a robotic system results from the interaction of its morphology, sensors and actuators with a particular environment. Model learning techniques, based on supervised machine learning, are widespread for determining the internal model. An important limitation of such approaches is that once a model has been learnt, it does not behave properly when the robot morphology is changed. From this it follows that there must exist a relationship between them. We propose a model for this correlation between the morphology and the internal model parameters, so that a new internal model can be predicted when the morphological parameters are modified. Different neural network architectures are proposed to address this high-dimensional regression problem. A case study is analyzed in detail to illustrate and evaluate the performance of the approach, namely, a pan-tilt robot head executing saccadic movements. The best results are obtained for an architecture with parallel neural networks due to the independence of its outputs. These results can be of great significance, since the predicted parameters can dramatically speed up the adaptation process following a change in morphology.

    Local Deep Implicit Functions for 3D Shape

    The goal of this project is to learn a 3D shape representation that enables accurate surface reconstruction, compact storage, efficient computation, consistency for similar shapes, generalization across diverse shape categories, and inference from depth camera observations. Towards this end, we introduce Local Deep Implicit Functions (LDIF), a 3D shape representation that decomposes space into a structured set of learned implicit functions. We provide networks that infer the space decomposition and local deep implicit functions from a 3D mesh or posed depth image. During experiments, we find that it provides 10.3 points higher surface reconstruction accuracy (F-Score) than the state-of-the-art (OccNet), while requiring fewer than 1 percent of the network parameters. Experiments on posed depth image completion and generalization to unseen classes show 15.8 and 17.8 point improvements over the state-of-the-art, while producing a structured 3D representation for each input with consistency across diverse shape collections. Comment: Camera ready version for CVPR 2020 Oral. Prior to review, this paper was referred to as DSIF, "Deep Structured Implicit Functions." 11 pages, 9 figures. Project video at https://youtu.be/3RAITzNWVJ

    Manifold Learning Approaches to Compressing Latent Spaces of Unsupervised Feature Hierarchies

    Field robots encounter dynamic unstructured environments containing a vast array of unique objects. In order to make sense of the world in which they are placed, they collect large quantities of unlabelled data with a variety of sensors. Producing robust and reliable applications depends entirely on the ability of the robot to understand the unlabelled data it obtains. Deep Learning techniques have had a high level of success in learning powerful unsupervised representations for a variety of discriminative and generative models. Applying these techniques to problems encountered in field robotics remains a challenging endeavour. Modern Deep Learning methods are typically trained with a substantial labelled dataset, while datasets produced in a field robotics context contain limited labelled training data. The primary motivation for this thesis stems from the problem of applying large-scale Deep Learning models to field robotics datasets that are label-poor. While the lack of labelled ground truth data drives the desire for unsupervised methods, the need for improving the model scaling is driven by two factors: performance and computational requirements. When utilising unsupervised layer outputs as representations for classification, the classification performance increases with layer size. Scaling up models with multiple large layers of features is problematic, as the size of each subsequent hidden layer scales with the size of the previous layer. This quadratic scaling, and the associated time required to train such networks, has prevented adoption of large Deep Learning models beyond cluster computing. The contributions in this thesis are developed from the observation that parameters or filter elements learnt in Deep Learning systems are typically highly structured, and contain related elements. Firstly, the structure of unsupervised filters is utilised to construct a mapping from the high-dimensional filter space to a low-dimensional manifold.
This creates a significantly smaller representation for subsequent feature learning. This mapping, and its effect on the resulting encodings, highlights the need for the ability to learn highly overcomplete sets of convolutional features. Driven by this need, the unsupervised pretraining of Deep Convolutional Networks is developed to include a number of modern training and regularisation methods. These pretrained models are then used to provide initialisations for supervised convolutional models trained on low quantities of labelled data. By utilising pretraining, a significant increase in classification performance on a number of publicly available datasets is achieved. In order to apply these techniques to outdoor 3D Laser Illuminated Detection And Ranging data, we develop a set of resampling techniques to provide uniform input to Deep Learning models. The features learnt in these systems outperform the high-effort, hand-engineered features developed specifically for 3D data. The representation of a given signal is then reinterpreted as a combination of modes that exist on the learnt low-dimensional filter manifold. From this, we develop an encoding technique that allows the high-dimensional layer output to be represented as a combination of low-dimensional components. This allows the growth of subsequent layers to depend only on the intrinsic dimensionality of the filter manifold and not on the number of elements contained in the previous layer. Finally, the resulting unsupervised convolutional model, the encoding frameworks and the embedding methodology are used to produce a new unsupervised learning strategy that is able to encode images in terms of overcomplete filter spaces, without producing an explosion in the size of the intermediate parameter spaces.
This model produces classification results on par with state-of-the-art models, yet requires significantly fewer computational resources and is suitable for use in the constrained computation environment of a field robot.
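
The quadratic scaling mentioned in the abstract can be illustrated with simple arithmetic, sketched here under the assumption of fully connected layers (the thesis itself works with convolutional features). The layer widths and the intrinsic dimensionality of 32 are made-up numbers, and `weight_count` is a hypothetical helper.

```python
def weight_count(n_prev, n_next):
    """Number of weights connecting a layer of n_prev units to one of n_next units."""
    return n_prev * n_next


# Doubling both layer widths quadruples the number of weights between them.
base = weight_count(1024, 1024)
doubled = weight_count(2048, 2048)
print(f"1024 -> 1024: {base:,} weights")
print(f"2048 -> 2048: {doubled:,} weights ({doubled // base}x)")

# If the previous layer's output is instead encoded with d low-dimensional
# components (d = 32 is an assumed intrinsic dimensionality), the next
# layer's size depends on d rather than on the previous layer width.
intrinsic_dim = 32
compressed = weight_count(intrinsic_dim, 2048)
print(f"  32 -> 2048: {compressed:,} weights")
```

This is only a counting argument; it says nothing about how the low-dimensional encoding is learnt.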

    Novel deep cross-domain framework for fault diagnosis of rotary machinery in prognostics and health management

    Improving the reliability of engineered systems is a crucial problem in many applications in various engineering fields, such as the aerospace, nuclear energy, and water desalination industries. This requires efficient and effective system health monitoring methods, including processing and analyzing massive machinery data to detect anomalies and performing diagnosis and prognosis. In recent years, deep learning has been a fast-growing field and has shown promising results for Prognostics and Health Management (PHM) in interpreting condition monitoring signals such as vibration, acoustic emission, and pressure, due to its capacity to mine complex representations from raw data. This doctoral research provides a systematic review of state-of-the-art deep learning-based PHM frameworks, an empirical analysis on bearing fault diagnosis benchmarks, and a novel multi-source domain adaptation framework. It emphasizes the most recent trends within the field and presents the benefits and potential of state-of-the-art deep neural networks for system health management. In addition, the limitations and challenges of the existing technologies are discussed, which leads to opportunities for future research. The empirical study of the benchmarks highlights the evaluation results of the existing models on bearing fault diagnosis benchmark datasets in terms of various performance metrics such as accuracy and training time. The results of the study provide a baseline for comparing or testing new models. A novel multi-source domain adaptation framework for fault diagnosis of rotary machinery is also proposed, which aligns the domains at both the feature level and the task level. The proposed framework transfers the knowledge from multiple labeled source domains into a single unlabeled target domain by reducing the feature distribution discrepancy between the target domain and each source domain. In addition, the model can be easily reduced to a single-source domain adaptation problem.
Also, the model can be readily adapted to unsupervised domain adaptation problems in other fields such as image classification and image segmentation. Further, the proposed model is modified with a novel conditional weighting mechanism that aligns the class-conditional probability of the domains and reduces the effect of irrelevant source domains, which is a critical issue in multi-source domain adaptation algorithms. The experimental verification results show the superiority of the proposed framework over state-of-the-art multi-source domain adaptation models.
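
The abstract describes reducing the feature distribution discrepancy between the target domain and each source domain, but does not name the discrepancy measure. As a minimal sketch, the code here assumes one common choice, the maximum mean discrepancy with a linear kernel, which reduces to the squared distance between the feature means. The function names and the toy feature values are hypothetical and do not come from the thesis.

```python
def mean_vector(features):
    """Column-wise mean of a list of equally sized feature vectors."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]


def linear_mmd2(source, target):
    """Squared MMD with a linear kernel: squared distance between feature means."""
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((s - t) ** 2 for s, t in zip(mu_s, mu_t))


def multi_source_discrepancy(sources, target):
    """Sum of the per-source discrepancies to the single target domain."""
    return sum(linear_mmd2(source, target) for source in sources)


# Toy 2D features: the first source matches the target mean, the second does not.
target = [[1.0, 1.0], [1.0, 1.0]]
source_a = [[0.0, 0.0], [2.0, 2.0]]
source_b = [[3.0, 1.0], [3.0, 1.0]]
print(multi_source_discrepancy([source_a, source_b], target))
```

In a training loop, a term of this kind would typically be added to the classification loss so that the feature extractor is pushed towards domain-invariant features. A weighting per source domain, as the conditional weighting mechanism in the abstract suggests, is not modelled here.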