35 research outputs found

    3D Object classification using a volumetric deep neural network: An efficient Octree Guided Auxiliary Learning approach

    Get PDF
    We consider the recent challenges of 3D shape analysis based on volumetric CNNs, which require enormous computational power. This high cost forces a reduction in volume resolution when applying 3D CNNs to volumetric data. In this context, we propose a multi-orientation volumetric deep neural network (MV-DNN) for 3D object classification, with an octree generating low-cost volumetric features. In contrast to conventional octree representations, we propose limiting the octree partition to a certain depth so as to preserve all leaf octants with sparsity features. This allows improved learning of complex 3D features and better prediction of object labels at both low and high resolutions. Our auxiliary learning approach predicts object classes from the sub-volume parts of a 3D object, which improves classification accuracy compared to other existing 3D volumetric CNN methods. In addition, the influence of the views and depths of the 3D model on classification performance is investigated through extensive experiments on the ModelNet40 database. Our deep learning framework runs significantly faster and consumes less memory than full voxel representations, and the results demonstrate the effectiveness of our octree-based auxiliary learning approach for exploring high-resolution 3D models. Experimental results show that our MV-DNN achieves better classification accuracy than state-of-the-art methods on two public databases
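The depth-limited octree partition described above can be sketched in a few lines. The recursion, the stopping rule, and the (origin, size, occupancy) leaf features below are illustrative assumptions for a minimal sketch, not the paper's actual implementation:

```python
import numpy as np

def build_octree(vol, max_depth, depth=0, origin=(0, 0, 0)):
    """Recursively partition a binary voxel volume, stopping at max_depth.

    Every leaf octant is kept, uniform or not, and records its occupancy
    fraction as a cheap sparsity feature (hypothetical sketch of the
    depth-limited partition described in the abstract).
    """
    occupancy = vol.mean()
    size = vol.shape[0]
    # Stop when the octant is uniform, one voxel, or at the depth limit.
    if depth == max_depth or size == 1 or occupancy in (0.0, 1.0):
        return {"origin": origin, "size": size, "occupancy": float(occupancy)}
    h = size // 2
    children = []
    for dz in (0, h):
        for dy in (0, h):
            for dx in (0, h):
                sub = vol[dz:dz + h, dy:dy + h, dx:dx + h]
                children.append(build_octree(
                    sub, max_depth, depth + 1,
                    (origin[0] + dz, origin[1] + dy, origin[2] + dx)))
    return {"origin": origin, "size": size, "children": children}

def leaf_features(node, out=None):
    """Collect (origin, size, occupancy) tuples from all leaf octants."""
    if out is None:
        out = []
    if "children" in node:
        for c in node["children"]:
            leaf_features(c, out)
    else:
        out.append((node["origin"], node["size"], node["occupancy"]))
    return out

vol = np.zeros((8, 8, 8), dtype=np.uint8)
vol[:4, :4, :4] = 1                    # one fully occupied corner block
tree = build_octree(vol, max_depth=2)
print(len(leaf_features(tree)))        # 8 leaves: one full, seven empty
```

Capping `max_depth` bounds the number of leaves, which is what keeps the resulting volumetric features low-cost even at high input resolution.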


    Multimodal Biomedical Data Visualization: Enhancing Network, Clinical, and Image Data Depiction

    Get PDF
    In this dissertation, we present visual analytics tools for several biomedical applications. Our research spans three types of biomedical data: reaction networks, longitudinal multidimensional clinical data, and biomedical images. For each data type, we present intuitive visual representations and efficient data exploration methods to facilitate visual knowledge discovery. Rule-based simulation has been used for studying complex protein interactions. In a rule-based model, the relationships of interacting proteins can be represented as a network. Nevertheless, understanding and validating the intended behaviors of large network models is difficult and error prone. We have developed a tool that first shows a network overview with concise visual representations and then shows relevant rule-specific details on demand. This strategy significantly improves visualization comprehensibility and disentangles the complex protein-protein relationships by showing them selectively alongside the global context of the network. Next, we present a tool for analyzing longitudinal multidimensional clinical datasets, which we developed for understanding Parkinson's disease progression. Detecting patterns involving multiple time-varying variables is especially challenging for clinical data. Conventional computational techniques, such as cluster analysis and dimension reduction, do not always generate interpretable, actionable results. Using our tool, users can select and compare patient subgroups by filtering patients with multiple symptoms simultaneously and interactively. Unlike conventional visualizations that use local features, many targets in biomedical images are characterized by high-level features. We present our research characterizing such high-level features through multiscale texture segmentation and deep-learning strategies. 
First, we present an efficient hierarchical texture segmentation approach for colorizing electron microscopy (EM) images that scales well to gigapixel sizes. This enhances the visual comprehensibility of gigapixel EM images across a wide range of scales. Second, we use convolutional neural networks (CNNs) to automatically derive high-level features that distinguish cell states in live-cell imagery and voxel types in 3D EM volumes. In addition, we present a CNN-based 3D segmentation method for biomedical volume datasets with limited training samples. We use factorized convolutions and feature-level augmentations to improve model generalization and avoid overfitting
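One way to see why factorized convolutions help when training samples are limited is a simple parameter count: splitting a dense 3D kernel into three 1D kernels along depth, height, and width shrinks the layer substantially. The 64-channel layer and the exact factorization below are hypothetical numbers for illustration, not the dissertation's actual architecture:

```python
def conv3d_params(c_in, c_out, kernel):
    """Weight count of a single 3D conv layer (bias omitted)."""
    kd, kh, kw = kernel
    return c_in * c_out * kd * kh * kw

# Full 3x3x3 convolution, 64 -> 64 channels.
full = conv3d_params(64, 64, (3, 3, 3))

# Factorized into three 1D convolutions along depth, height, width
# (an assumed layout; other factorizations are possible).
factorized = (conv3d_params(64, 64, (3, 1, 1))
              + conv3d_params(64, 64, (1, 3, 1))
              + conv3d_params(64, 64, (1, 1, 3)))

print(full, factorized)   # 110592 vs 36864: a 3x parameter reduction
```

Fewer parameters means fewer degrees of freedom to overfit, which is the motivation the abstract gives for using factorized convolutions with small training sets.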

    Mixing Deep Networks and Entangled Forests for the Semantic Segmentation of 3D Indoor Scenes

    Get PDF
    This work focuses on semantic segmentation of indoor 3D data, that is, assigning a label to every point in point clouds representing working spaces. After a review of the current state of the art, traditional approaches such as random forests and PointNet-based deep neural networks are evaluated. The Superpoint Graph architecture and the 3D Entangled Forests algorithm are then selected, and their features are combined in an attempt to enhance performance

    Integrating Deep Learning into Digital Rock Analysis Workflow

    Full text link
    Digital Rock Analysis (DRA) has expanded our knowledge about natural phenomena in various geoscience specialties. DRA as an emerging technology has limitations, including (1) the trade-off between the size of the spatial domain and resolution, (2) methodological and human-induced errors in segmentation, and (3) the computational costs associated with intensive modeling. Deep learning (DL) methods are utilized to alleviate these limitations. First, two DL frameworks are utilized to probe the performance gains from using Convolutional Neural Networks (CNNs) to super-resolve and segment real multi-resolution X-ray images of complex carbonate rocks. The first framework explores the application of U-Net and U-ResNet architectures to obtain macropore, solid, and micropore segmented images in an end-to-end scheme. The second framework segregates super-resolution and segmentation into two networks: EDSR and U-ResNet. Both frameworks show consistent performance, indicated by voxel-wise accuracy metrics, the measured phase morphology, and flow characteristics. The end-to-end framework is shown to be superior to the segregated approach, confirming the adequacy of end-to-end learning for performing complex tasks. Second, the accuracy margins of CNNs in estimating physical properties of porous media from 2D X-ray images are investigated. Binary and greyscale sandstone images are used as input to CNN architectures to estimate the porosity, specific surface area, and average pore size of three sandstone images. The results show encouraging margins of accuracy, where the error in estimating these properties can be up to 6% when using binary images and up to 7% when using greyscale images. Third, the suitability of CNNs as regression tools to predict a more challenging property, permeability, is investigated. Two complex CNN architectures (ResNet and ResNeXt) are applied to learn the morphology of pore space in 3D porous media images for flow-based characterization. 
The dataset includes more than 29,000 3D sub-volumes of multiple sandstone and carbonate rocks. The findings show promising regression accuracy using binary images. Accuracy gains are observed when using conductivity maps as input to the networks. Permeability inference on unseen samples can be achieved in 120 ms/sample with an average relative error of 18.9%. This thesis demonstrates the significant potential of deep learning in improving DRA capabilities
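The simpler targets the CNNs regress, such as porosity and specific surface area, can also be computed directly from a binary image, which is how reference values are typically obtained. The sketch below assumes pore pixels are 1 and solid pixels are 0, and uses a crude 4-neighbour interface count for specific surface; it is a simplified reference computation, not the thesis's code:

```python
import numpy as np

def porosity(img):
    """Porosity = pore fraction of the image (pores assumed to be 1)."""
    return img.mean()

def specific_surface(img):
    """Crude pore-solid interface estimate: 4-neighbour transitions
    per unit image area (a simplification of the true specific
    surface area)."""
    horiz = np.abs(np.diff(img.astype(int), axis=1)).sum()
    vert = np.abs(np.diff(img.astype(int), axis=0)).sum()
    return (horiz + vert) / img.size

img = np.zeros((4, 4), dtype=np.uint8)
img[1:3, 1:3] = 1             # a single 2x2 pore in solid
print(porosity(img))          # 4/16 = 0.25
print(specific_surface(img))  # 8 transitions / 16 pixels = 0.5
```

For permeability no such direct formula exists (it requires flow simulation), which is why the thesis treats it as the more challenging regression target.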

    Proceedings of the 2020 Joint Workshop of Fraunhofer IOSB and Institute for Anthropomatics, Vision and Fusion Laboratory

    Get PDF
    In 2020 the annual joint workshop of Fraunhofer IOSB and the Lehrstuhl für Interaktive Echtzeitsysteme took place. From July 27 to 31, the doctoral students of the two institutes presented the state of their research on topics such as AI, machine learning, computer vision, usage control, and metrology. The results of these presentations are collected in this volume as technical reports

    Learning Equivariant Representations

    Get PDF
    State-of-the-art deep learning systems often require large amounts of data and computation. For this reason, leveraging known or unknown structure of the data is paramount. Convolutional neural networks (CNNs) are successful examples of this principle, their defining characteristic being shift-equivariance: because a filter slides over the input, when the input shifts, the response shifts by the same amount, exploiting the structure of natural images, where semantic content is independent of absolute pixel positions. This property is essential to the success of CNNs in audio, image and video recognition tasks. In this thesis, we extend equivariance to other kinds of transformations, such as rotation and scaling. We propose equivariant models for different transformations defined by groups of symmetries. The main contributions are (i) polar transformer networks, achieving equivariance to the group of similarities on the plane, (ii) equivariant multi-view networks, achieving equivariance to the group of symmetries of the icosahedron, (iii) spherical CNNs, achieving equivariance to the continuous 3D rotation group, (iv) cross-domain image embeddings, achieving equivariance to 3D rotations for 2D inputs, and (v) spin-weighted spherical CNNs, generalizing the spherical CNNs and achieving equivariance to 3D rotations for spherical vector fields. Applications include image classification, 3D shape classification and retrieval, panoramic image classification and segmentation, shape alignment and pose estimation. What these models have in common is that they leverage symmetries in the data to reduce sample and model complexity and improve generalization performance. The advantages are more significant on (but not limited to) challenging tasks where data is limited or input perturbations such as arbitrary rotations are present
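The shift-equivariance property described above can be checked numerically: convolving a shifted signal gives the same result as shifting the convolved signal. The minimal 1D circular cross-correlation below is an illustrative sketch; periodic boundaries are assumed so that the shift identity holds exactly, which real CNNs with zero padding only satisfy approximately:

```python
import numpy as np

def circ_conv(x, w):
    """1D circular cross-correlation, the core operation of a conv layer
    (periodic padding makes the equivariance exact)."""
    n = len(x)
    return np.array([np.dot(np.roll(x, -i)[:len(w)], w) for i in range(n)])

x = np.array([1.0, 2.0, 3.0, 4.0, 0.0, 0.0])   # toy input signal
w = np.array([1.0, -1.0])                       # toy filter
shift = 2

# Equivariance: conv(shift(x)) == shift(conv(x)).
left = circ_conv(np.roll(x, shift), w)
right = np.roll(circ_conv(x, w), shift)
print(np.allclose(left, right))   # True
```

The thesis's contributions replace the shift group in this identity with other symmetry groups (planar similarities, icosahedral symmetries, 3D rotations), with group-specific convolutions playing the role of `circ_conv`.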