10 research outputs found

    Unsupervised 3D Learning for Shape Analysis via Multiresolution Instance Discrimination

    Although unsupervised feature learning has demonstrated its advantages in reducing the workload of data labeling and network design in many fields, existing unsupervised 3D learning methods still cannot offer a generic network for various shape analysis tasks with performance competitive with supervised methods. In this paper, we propose an unsupervised method for learning a generic and efficient shape encoding network for different shape analysis tasks. The key idea of our method is to jointly encode and learn shape and point features from unlabeled 3D point clouds. For this purpose, we adapt HR-Net to octree-based convolutional neural networks for jointly encoding shape and point features with fused multiresolution subnetworks, and we design a simple yet efficient Multiresolution Instance Discrimination (MID) loss for jointly learning the shape and point features. Our network takes a 3D point cloud as input and outputs both shape and point features. After training, the network is concatenated with simple task-specific back-end layers and fine-tuned for different shape analysis tasks. We evaluate the efficacy and generality of our method and validate our network and loss design on a set of shape analysis tasks, including shape classification, semantic shape segmentation, and shape registration. With simple back-ends, our network demonstrates the best performance among all unsupervised methods and achieves performance competitive with supervised methods, especially in tasks with a small labeled dataset. For fine-grained shape segmentation, our method even surpasses existing supervised methods by a large margin. Comment: Accepted by AAAI 2021. Code: https://github.com/microsoft/O-CNN/blob/master/docs/unsupervised.m
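
    The abstract above does not spell out the MID loss, so the following is only a rough sketch of the general instance-discrimination idea it builds on: every training shape is treated as its own class, and a feature is scored against one prototype per instance. The function names, the prototype matrix, and the temperature value are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def l2_normalise(x):
    """Scale each row to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def instance_discrimination_loss(features, instance_ids, prototypes, temperature=0.1):
    """Cross-entropy in which every dataset instance is its own class.

    features:     (B, D) L2-normalised features for a batch
    instance_ids: (B,)   dataset index of each sample in the batch
    prototypes:   (N, D) one L2-normalised vector per dataset instance (assumed)
    temperature:  softmax temperature (assumed value)
    """
    logits = features @ prototypes.T / temperature         # (B, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(instance_ids)), instance_ids].mean()

# Hypothetical usage: 8 training shapes with 16-dimensional features.
rng = np.random.default_rng(0)
prototypes = l2_normalise(rng.normal(size=(8, 16)))
batch_ids = np.array([2, 5])
batch_features = prototypes[batch_ids]   # features that match their own instance
loss_matched = instance_discrimination_loss(batch_features, batch_ids, prototypes)
loss_mismatched = instance_discrimination_loss(batch_features, np.array([3, 6]), prototypes)
print(loss_matched, loss_mismatched)
```

    In this sketch the loss is lower when each feature is paired with its own instance index than when it is paired with a different one, which is the behaviour an instance-discrimination objective rewards.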

    Exploring deep learning techniques for wild animal behaviour classification using animal-borne accelerometers

    Otsuka R., Yoshimura N., Tanigaki K., et al. Exploring deep learning techniques for wild animal behaviour classification using animal-borne accelerometers. Methods in Ecology and Evolution 15, 716 (2024); https://doi.org/10.1111/2041-210X.14294. Machine learning-based behaviour classification using acceleration data is a powerful tool in bio-logging research. Deep learning architectures such as convolutional neural networks (CNN), long short-term memory (LSTM) and self-attention mechanisms, as well as related training techniques, have been extensively studied in human activity recognition. However, they have rarely been used in wild animal studies. The main challenges of acceleration-based wild animal behaviour classification include data shortages, class imbalance problems, various types of noise in the data due to differences in individual behaviour and in where the loggers were attached, and complexity in the data due to complex animal-specific behaviours, all of which may have limited the application of deep learning techniques in this area. To overcome these challenges, we explored the effectiveness of techniques for efficient model training: data augmentation, manifold mixup and pre-training of deep learning models with unlabelled data, using datasets from two species of wild seabirds and state-of-the-art deep learning model architectures. Data augmentation improved the overall model performance when one of the various techniques (none, scaling, jittering, permutation, time-warping and rotation) was randomly applied to each sample during mini-batch training. Manifold mixup also improved model performance, but not as much as random data augmentation. Pre-training with unlabelled data did not improve model performance. The state-of-the-art deep learning models, including a model consisting of four CNN layers, an LSTM layer and a multi-head attention layer, as well as its modified version with a shortcut connection, outperformed the other models compared. Using only raw acceleration data as inputs, these models outperformed classic machine learning approaches that used 119 handcrafted features. Our experiments showed that deep learning techniques are promising for acceleration-based behaviour classification of wild animals and highlighted some challenges (e.g. effective use of unlabelled data). There is scope for greater exploration of deep learning techniques in wild animal studies (e.g. advanced data augmentation, multimodal sensor data use, transfer learning and self-supervised learning). We hope that this study will stimulate the development of deep learning techniques for wild animal behaviour classification using time-series sensor data.
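
    The random augmentation scheme described in this abstract (one of none, scaling, jittering, permutation, time-warping or rotation chosen per sample in each mini-batch) could look roughly like the sketch below. It assumes tri-axial windows of shape (time, 3); the function names and all parameter values (noise scales, segment and knot counts) are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def scaling(x, rng, sigma=0.1):
    # Multiply each axis by a random factor close to 1.
    return x * rng.normal(1.0, sigma, size=(1, x.shape[1]))

def jittering(x, rng, sigma=0.05):
    # Add independent Gaussian noise to every reading.
    return x + rng.normal(0.0, sigma, size=x.shape)

def permutation(x, rng, n_segments=4):
    # Cut the window into segments and shuffle their order.
    segments = np.array_split(x, n_segments, axis=0)
    order = rng.permutation(n_segments)
    return np.concatenate([segments[i] for i in order], axis=0)

def time_warping(x, rng, sigma=0.2, n_knots=4):
    # Resample the window along a smoothly distorted time axis.
    t = x.shape[0]
    knots = np.linspace(0, t - 1, n_knots + 2)
    speeds = np.abs(rng.normal(1.0, sigma, size=n_knots + 2)) + 1e-3
    speed = np.interp(np.arange(t), knots, speeds)
    warped = np.cumsum(speed)
    warped = (warped - warped[0]) / (warped[-1] - warped[0]) * (t - 1)
    return np.stack(
        [np.interp(warped, np.arange(t), x[:, c]) for c in range(x.shape[1])],
        axis=1,
    )

def rotation(x, rng):
    # Apply a random 3D rotation, mimicking a differently oriented logger.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:      # make it a proper rotation, not a reflection
        q[:, 0] *= -1
    return x @ q.T

def identity(x, rng):
    # The "none" option: leave the sample unchanged.
    return x

AUGMENTATIONS = [identity, scaling, jittering, permutation, time_warping, rotation]

def augment_batch(batch, rng):
    # Pick one augmentation at random for each sample in the mini-batch.
    out = []
    for x in batch:
        augment = AUGMENTATIONS[rng.integers(len(AUGMENTATIONS))]
        out.append(augment(x, rng))
    return np.stack(out)

# Hypothetical usage: 8 windows of 50 tri-axial readings.
rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 50, 3))
augmented = augment_batch(batch, rng)
print(augmented.shape)
```

    Drawing the augmentation per sample rather than per batch means each mini-batch mixes several kinds of distortion, which matches the "randomly applied to each sample" wording in the abstract.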

    Manifold Learning Approaches to Compressing Latent Spaces of Unsupervised Feature Hierarchies

    Field robots encounter dynamic unstructured environments containing a vast array of unique objects. In order to make sense of the world in which they are placed, they collect large quantities of unlabelled data with a variety of sensors. Producing robust and reliable applications depends entirely on the ability of the robot to understand the unlabelled data it obtains. Deep Learning techniques have had a high level of success in learning powerful unsupervised representations for a variety of discriminative and generative models. Applying these techniques to problems encountered in field robotics remains a challenging endeavour. Modern Deep Learning methods are typically trained with a substantial labelled dataset, while datasets produced in a field robotics context contain limited labelled training data. The primary motivation for this thesis stems from the problem of applying large scale Deep Learning models to field robotics datasets that are label poor. While the lack of labelled ground truth data drives the desire for unsupervised methods, the need for improving the model scaling is driven by two factors: performance and computational requirements. When utilising unsupervised layer outputs as representations for classification, the classification performance increases with layer size. Scaling up models with multiple large layers of features is problematic, as the sizes of subsequent hidden layers scale with the size of the previous layer. This quadratic scaling, and the associated time required to train such networks, have prevented adoption of large Deep Learning models beyond cluster computing. The contributions in this thesis are developed from the observation that parameters or filter elements learnt in Deep Learning systems are typically highly structured, and contain related elements. Firstly, the structure of unsupervised filters is utilised to construct a mapping from the high dimensional filter space to a low dimensional manifold. This creates a significantly smaller representation for subsequent feature learning. This mapping, and its effect on the resulting encodings, highlights the need for the ability to learn highly overcomplete sets of convolutional features. Driven by this need, the unsupervised pretraining of Deep Convolutional Networks is developed to include a number of modern training and regularisation methods. These pretrained models are then used to provide initialisations for supervised convolutional models trained on low quantities of labelled data. By utilising pretraining, a significant increase in classification performance on a number of publicly available datasets is achieved. In order to apply these techniques to outdoor 3D Laser Illuminated Detection And Ranging data, we develop a set of resampling techniques to provide uniform input to Deep Learning models. The features learnt in these systems outperform the high-effort hand-engineered features developed specifically for 3D data. The representation of a given signal is then reinterpreted as a combination of modes that exist on the learnt low dimensional filter manifold. From this, we develop an encoding technique that allows the high dimensional layer output to be represented as a combination of low dimensional components. This allows the growth of subsequent layers to only be dependent on the intrinsic dimensionality of the filter manifold and not the number of elements contained in the previous layer. Finally, the resulting unsupervised convolutional model, the encoding frameworks and the embedding methodology are used to produce a new unsupervised learning strategy that is able to encode images in terms of overcomplete filter spaces, without producing an explosion in the size of the intermediate parameter spaces. This model produces classification results on par with state of the art models, yet requires significantly fewer computational resources and is suitable for use in the constrained computation environment of a field robot.
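
    The quadratic scaling mentioned in this abstract can be illustrated with simple arithmetic. The sketch below assumes fully connected layers without biases and a fixed width ratio between consecutive layers; the function names, the ratio, and the example sizes are assumptions chosen for illustration and are not taken from the thesis.

```python
def dense_weights(n_in, n_out):
    # Number of weights in a bias-free fully connected layer.
    return n_in * n_out

def weights_when_width_follows_previous(n_prev, ratio=2):
    # Next layer is `ratio` times as wide as the previous one,
    # so the weight count grows with the square of the previous width.
    return dense_weights(n_prev, ratio * n_prev)

def weights_from_low_dim_encoding(intrinsic_dim, n_out):
    # If the previous layer's output is first encoded in `intrinsic_dim`
    # components, the next layer's size no longer depends on the raw width.
    return dense_weights(intrinsic_dim, n_out)

# Hypothetical sizes: doubling the previous width quadruples the weights.
small = weights_when_width_follows_previous(1_000)
large = weights_when_width_follows_previous(2_000)
compact = weights_from_low_dim_encoding(intrinsic_dim=32, n_out=4_000)
print(small, large, compact)
```

    Under these assumptions, the weight count of the next layer depends on the encoding dimension rather than on the number of units in the previous layer, which is the effect the abstract attributes to the low dimensional filter manifold.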

    Integrated Graph Theoretic, Radiomics, and Deep Learning Framework for Personalized Clinical Diagnosis, Prognosis, and Treatment Response Assessment of Body Tumors

    Purpose: A new paradigm is beginning to emerge in radiology with the advent of increased computational capabilities and algorithms. The future of radiological reading rooms is heading towards a unique collaboration between computer scientists and radiologists. The goal of computational radiology is to probe the underlying tissue using advanced algorithms and imaging parameters and produce a personalized diagnosis that can be correlated to pathology. This thesis presents a complete computational radiology framework (I-GRAD) for personalized clinical diagnosis, prognosis and treatment planning using an integration of graph theory, radiomics, and deep learning. Methods: There are three major components of the I-GRAD framework: image segmentation, feature extraction, and clinical decision support. Image Segmentation: I developed the multiparametric deep learning (MPDL) tissue signature model for segmentation of normal and abnormal tissue from multiparametric (mp) radiological images. The segmentation MPDL network was constructed from stacked sparse autoencoders (SSAE) with five hidden layers. The MPDL network parameters were optimized using k-fold cross-validation. In addition, the MPDL segmentation network was tested on an independent dataset. Feature Extraction: I developed the radiomic feature mapping (RFM) and contribution scattergram (CSg) methods for characterization of spatial and inter-parametric relationships in multiparametric imaging datasets. The radiomic feature maps were created by filtering radiological images with first and second order statistical texture filters followed by the development of standardized features for radiological correlation to biology and clinical decision support. The contribution scattergram was constructed to visualize and understand the inter-parametric relationships of the breast MRI as a complex network. This multiparametric imaging complex network was modeled using manifold learning and evaluated using graph theoretic analysis. Feature Integration: The different clinical and radiological features extracted from multiparametric radiological images and clinical records were integrated using a hybrid multiview manifold learning technique termed the Informatics Radiomics Integration System (IRIS). IRIS uses hierarchical clustering in combination with manifold learning to visualize the high-dimensional patient space on a two-dimensional heatmap. The heatmap highlights the similarity and dissimilarity between different patients and variables. Results: All the algorithms and techniques presented in this dissertation were developed and validated using breast cancer as a model for diagnosis and prognosis using multiparametric breast magnetic resonance imaging (MRI). The deep learning MPDL method demonstrated excellent Dice similarity of 0.87±0.05 and 0.84±0.07 for segmentation of lesions in malignant and benign breast patients, respectively. Furthermore, each of the methods, MPDL, RFM, and CSg, demonstrated excellent results for breast cancer diagnosis, with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.85, 0.91, and 0.87, respectively. In addition, IRIS classified patients with low risk of breast cancer recurrence from patients with medium and high risk with an AUC of 0.93 compared to OncotypeDX, a 21 gene assay for breast cancer recurrence. Conclusion: By integrating advanced computer science methods into the radiological setting, the I-GRAD framework presented in this thesis can be used to model radiological imaging data in combination with clinical and histopathological data and produce new tools for personalized diagnosis, prognosis or treatment planning by physicians.
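
    The Dice similarity reported in this abstract measures the overlap between a predicted segmentation and a reference mask as twice the intersection divided by the sum of the two mask sizes. A minimal sketch is given below; the function name, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions for illustration, not details from the thesis.

```python
import numpy as np

def dice_similarity(pred_mask, true_mask):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    total = pred.sum() + true.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, true).sum() / total

# Hypothetical 2x3 masks: 3 pixels each, 2 of them shared.
predicted = np.array([[1, 1, 0],
                      [0, 1, 0]])
reference = np.array([[1, 0, 0],
                      [0, 1, 1]])
score = dice_similarity(predicted, reference)
print(score)
```

    A score of 1 means the two masks coincide exactly and 0 means they share no pixels, so values such as those quoted in the abstract indicate substantial but imperfect overlap.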