6 research outputs found

    Does Monocular Depth Estimation Provide Better Pre-training than Classification for Semantic Segmentation?

    Training a deep neural network for semantic segmentation is labor-intensive, so it is common to pre-train it for a different task, and then fine-tune it with a small annotated dataset. State-of-the-art methods use image classification for pre-training, which introduces uncontrolled biases. We test the hypothesis that depth estimation from unlabeled videos may provide better pre-training. Despite the absence of any semantic information, we argue that estimating scene geometry is closer to the task of semantic segmentation than classifying whole images into semantic classes. Since analytical validation is intractable, we test the hypothesis empirically by introducing a pre-training scheme that yields an improvement of 5.7% mIoU and 4.1% pixel accuracy over classification-based pre-training. While annotation is not needed for pre-training, it is needed for testing the hypothesis. We use the KITTI (outdoor) and NYU-V2 (indoor) benchmarks to that end, and provide an extensive discussion of the benefits and limitations of the proposed scheme in relation to existing unsupervised, self-supervised, and semi-supervised pre-training protocols.
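
    As a rough illustration of the pre-train-then-fine-tune idea described in this abstract, the Python/PyTorch sketch below shares one encoder between a depth head and a segmentation head. The ResNet-18 backbone, the decoder layout, the 19-class output, and the loss choices are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn
    import torchvision.models as models

    class Encoder(nn.Module):
        """Shared backbone; pre-trained on depth, then reused for segmentation."""
        def __init__(self):
            super().__init__()
            resnet = models.resnet18(weights=None)
            self.features = nn.Sequential(*list(resnet.children())[:-2])  # keep conv stages only

        def forward(self, x):
            return self.features(x)  # (B, 512, H/32, W/32)

    class DenseHead(nn.Module):
        """Lightweight decoder used for both depth (1 channel) and segmentation (C channels)."""
        def __init__(self, out_channels):
            super().__init__()
            self.head = nn.Sequential(
                nn.Conv2d(512, 128, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(128, out_channels, 1),
                nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False),
            )

        def forward(self, feats):
            return self.head(feats)

    encoder = Encoder()

    # Stage 1: pre-train encoder + depth head on unlabeled video frames.
    depth_head = DenseHead(out_channels=1)
    depth_loss = nn.L1Loss()  # stand-in for the paper's depth pre-training objective

    # Stage 2: keep the encoder, attach a segmentation head, fine-tune on the small labeled set.
    seg_head = DenseHead(out_channels=19)  # 19 classes is an assumption for illustration
    seg_loss = nn.CrossEntropyLoss()

    In stage one only the encoder and depth head would be trained on unlabeled video; in stage two the encoder weights are kept and the segmentation head is fine-tuned on the small annotated dataset.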

    Unsupervised object-centric video generation and decomposition in 3D

    A natural approach to generative modeling of videos is to represent them as a composition of moving objects. Recent works model a set of 2D sprites over a slowly varying background, but without considering the underlying 3D scene that gives rise to them. We instead propose to model a video as the view seen while moving through a scene with multiple 3D objects and a 3D background. Our model is trained from monocular videos without any supervision, yet learns to generate coherent 3D scenes containing several moving objects. We conduct detailed experiments on two datasets, going beyond the visual complexity supported by state-of-the-art generative approaches. We evaluate our method on depth prediction and 3D object detection -- tasks which cannot be addressed by those earlier works -- and show it outperforms them even on 2D instance segmentation and tracking. Comment: Appeared at NeurIPS 2020. Project page: http://pmh47.net/o3v
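
    The modeling idea (several 3D objects rendered over a 3D background and composited into each video frame) can be sketched in Python/PyTorch as a plain back-to-front "over" composite. Tensor shapes, the ordering assumption, and the function name are hypothetical; the paper's differentiable renderer and generative model are not reproduced here.

    import torch

    def composite(background_rgb, object_rgb, object_alpha):
        """background_rgb: (3, H, W); object_rgb: (K, 3, H, W); object_alpha: (K, 1, H, W).
        Objects are assumed sorted back-to-front by depth for the current camera pose."""
        frame = background_rgb
        for rgb, alpha in zip(object_rgb, object_alpha):
            frame = alpha * rgb + (1.0 - alpha) * frame  # standard "over" operator
        return frame

    # One 64x64 frame composed from a background render and two object renders.
    frame = composite(torch.rand(3, 64, 64), torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64))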

    Deep Learning of Microstructures

    The internal structure of materials, also called the microstructure, plays a critical role in the properties and performance of materials. The chemical element composition is one of the most critical factors in changing the structure of materials. However, the chemical composition alone is not the determining factor, and a change in the production process can also significantly alter the materials' structure. Therefore, many efforts have been made to discover and improve production methods to optimize the functional properties of materials. The most critical challenge in finding materials with enhanced properties is to understand and define the salient features of the structure of materials that have the most significant impact on the desired property. In other words, through process-structure-property (PSP) linkages, the effect of changing process variables on the material structure, and consequently on the property, can be examined and used as a powerful tool in designing materials with desirable characteristics. In particular, the construction of forward PSP linkages has received considerable attention thanks to sophisticated physics-based models. Recently, machine learning (ML) and data science have also been used as powerful tools to find PSP linkages in materials science. One key advantage of ML-based models is their ability to construct both forward and inverse PSP linkages. Early ML models in materials science were primarily focused on constructing process-property linkages. Recently, more microstructure information has been included in materials-design ML models. However, the inverse design of microstructures, i.e., the prediction of process and chemistry from a microstructure morphology image, has received limited attention. This is a critical knowledge gap, specifically for problems where the ideal microstructure or morphology and the specific chemistry associated with its morphological domains are known, but the chemistry and processing that would lead to that ideal morphology are unknown.
    In this study, we first propose a framework based on a deep learning approach that enables us to predict the chemistry and processing history just by reading the morphological distribution of one element. As a case study, we used a dataset from a spinodal decomposition simulation of a Fe-Cr-Co alloy created by the phase-field method. The mixed dataset, which includes both images, i.e., the morphology of the Fe distribution, and continuous data, i.e., the minimum and maximum Fe concentrations in the microstructures, is used as input data, and the spinodal temperature and initial chemical composition are used as output data to train the proposed deep neural network. The proposed convolutional layers were compared with pretrained EfficientNet convolutional layers used as transfer learning for microstructure feature extraction. The results show that the trained shallow network is effective for chemistry prediction; however, accurate prediction of the processing temperature requires more complex feature extraction from the morphology of the microstructure. We benchmarked the model's predictive accuracy for real alloy systems with a Fe-Cr-Co transmission electron microscopy micrograph. The predicted chemistry and heat treatment temperature were in good agreement with the ground truth. The treatment time was considered constant in this first study.
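
    A minimal sketch of such a fused-data network in Python/PyTorch is given below: a small CNN branch reads the Fe morphology image, a dense branch reads the scalar minimum/maximum concentrations, and the merged features regress temperature and composition. The layer sizes, the 64x64 input, and the two-value output are illustrative assumptions, not the architecture used in the thesis.

    import torch
    import torch.nn as nn

    class FusedNet(nn.Module):
        def __init__(self):
            super().__init__()
            # Image branch: Fe morphology map (1 channel, assumed 64x64).
            self.cnn = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Flatten(),  # 32 * 16 * 16 features for a 64x64 input
            )
            # Scalar branch: minimum and maximum Fe concentration.
            self.scalars = nn.Sequential(nn.Linear(2, 16), nn.ReLU())
            # Fused regression head: spinodal temperature and initial composition.
            self.head = nn.Linear(32 * 16 * 16 + 16, 2)

        def forward(self, image, concentrations):
            fused = torch.cat([self.cnn(image), self.scalars(concentrations)], dim=1)
            return self.head(fused)

    model = FusedNet()
    out = model(torch.rand(4, 1, 64, 64), torch.rand(4, 2))  # shape (4, 2)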
    In the second work, we propose a fused-data deep learning framework that can predict the heat treatment time, as well as the temperature and initial chemical composition, by reading the morphology of the Fe distribution and its concentration. The results show that the trained deep neural network has the highest accuracy for chemistry, followed by time and temperature. We identified two scenarios for inaccurate predictions: 1) several processing paths lead to an identical microstructure, and 2) microstructures reach steady-state morphologies after long aging times. The error analysis shows that most of the incorrect predictions are not truly wrong, but rather alternative correct answers. We validated the model successfully with an experimental Fe-Cr-Co transmission electron microscopy micrograph.
    Finally, since data generation by simulation is computationally expensive, we propose a quick and accurate Predictive Recurrent Neural Network (PredRNN) model for microstructure evolution prediction. Essentially, microstructure evolution prediction is a spatiotemporal sequence prediction problem, where predicting the material microstructure is difficult due to different process histories and chemistries. As a case study, we used a dataset from a spinodal decomposition simulation of a Fe-Cr-Co alloy created by the phase-field method for training and for predicting future microstructures from previous observations. The results show that the trained network is capable of efficient prediction of microstructure evolution.
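
    The evolution-prediction task can be illustrated with the following Python/PyTorch sketch, which maps the previous T microstructure frames to the next frame with a plain convolutional model. The actual work uses PredRNN with spatiotemporal memory cells; that architecture, the value of T, and the 64x64 resolution here are simplifying assumptions.

    import torch
    import torch.nn as nn

    T = 5  # number of context frames (illustrative)

    # Plain convolutional next-frame predictor used only to show the setup;
    # the thesis uses PredRNN, which is not reproduced in this sketch.
    next_frame_model = nn.Sequential(
        nn.Conv2d(T, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),  # predicted Fe morphology map
    )

    context = torch.rand(8, T, 64, 64)      # batch of 8 sequences of T frames
    prediction = next_frame_model(context)  # (8, 1, 64, 64): next microstructure frame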