114 research outputs found

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, inevitably RS draws from many of the same theories as CV; e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing the DL. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing.

    Deep Learning in Cardiology

    The medical field is creating large amounts of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient in solving complicated medical tasks or for creating insights using big data. Deep learning has emerged as a more accurate and effective technology in a wide range of medical problems such as diagnosis, prediction and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology that also apply in medicine in general, while proposing certain directions as the most viable for clinical use. Comment: 27 pages, 2 figures, 10 tables.

    Manifold Learning Approaches to Compressing Latent Spaces of Unsupervised Feature Hierarchies

    Field robots encounter dynamic unstructured environments containing a vast array of unique objects. In order to make sense of the world in which they are placed, they collect large quantities of unlabelled data with a variety of sensors. Producing robust and reliable applications depends entirely on the ability of the robot to understand the unlabelled data it obtains. Deep Learning techniques have had a high level of success in learning powerful unsupervised representations for a variety of discriminative and generative models. Applying these techniques to problems encountered in field robotics remains a challenging endeavour. Modern Deep Learning methods are typically trained with a substantial labelled dataset, while datasets produced in a field robotics context contain limited labelled training data. The primary motivation for this thesis stems from the problem of applying large scale Deep Learning models to field robotics datasets that are label poor. While the lack of labelled ground truth data drives the desire for unsupervised methods, the need for improving the model scaling is driven by two factors: performance and computational requirements. When utilising unsupervised layer outputs as representations for classification, the classification performance increases with layer size. Scaling up models with multiple large layers of features is problematic, as the size of each subsequent hidden layer scales with the size of the previous layer. This quadratic scaling, and the associated time required to train such networks, has prevented adoption of large Deep Learning models beyond cluster computing. The contributions in this thesis are developed from the observation that parameters or filter elements learnt in Deep Learning systems are typically highly structured, and contain related elements. Firstly, the structure of unsupervised filters is utilised to construct a mapping from the high dimensional filter space to a low dimensional manifold.
This creates a significantly smaller representation for subsequent feature learning. This mapping, and its effect on the resulting encodings, highlights the need for the ability to learn highly overcomplete sets of convolutional features. Driven by this need, the unsupervised pretraining of Deep Convolutional Networks is developed to include a number of modern training and regularisation methods. These pretrained models are then used to provide initialisations for supervised convolutional models trained on low quantities of labelled data. By utilising pretraining, a significant increase in classification performance on a number of publicly available datasets is achieved. In order to apply these techniques to outdoor 3D Laser Illuminated Detection And Ranging data, we develop a set of resampling techniques to provide uniform input to Deep Learning models. The features learnt in these systems outperform the high effort hand engineered features developed specifically for 3D data. The representation of a given signal is then reinterpreted as a combination of modes that exist on the learnt low dimensional filter manifold. From this, we develop an encoding technique that allows the high dimensional layer output to be represented as a combination of low dimensional components. This allows the growth of subsequent layers to only be dependent on the intrinsic dimensionality of the filter manifold and not the number of elements contained in the previous layer. Finally, the resulting unsupervised convolutional model, the encoding frameworks and the embedding methodology are used to produce a new unsupervised learning strategy that is able to encode images in terms of overcomplete filter spaces, without producing an explosion in the size of the intermediate parameter spaces.
This model produces classification results on par with state of the art models, yet requires significantly fewer computational resources and is suitable for use in the constrained computation environment of a field robot.

    Understanding a Dynamic World: Dynamic Motion Estimation for Autonomous Driving Using LIDAR

    In a society that is heavily reliant on personal transportation, autonomous vehicles present an increasingly intriguing technology. They have the potential to save lives, promote efficiency, and enable mobility. However, before this vision becomes a reality, there are a number of challenges that must be solved. One key challenge involves problems in dynamic motion estimation, as it is critical for an autonomous vehicle to have an understanding of the dynamics in its environment for it to operate safely on the road. Accordingly, this thesis presents several algorithms for dynamic motion estimation for autonomous vehicles. We focus on methods using light detection and ranging (LIDAR), a prevalent sensing modality used by autonomous vehicle platforms, due to its advantages over other sensors, such as cameras, including lighting invariance and fidelity of 3D geometric data. First, we propose a dynamic object tracking algorithm. The proposed method takes as input a stream of LIDAR data from a moving object collected by a multi-sensor platform. It generates an estimate of its trajectory over time and a point cloud model of its shape. We formulate the problem similarly to simultaneous localization and mapping (SLAM), allowing us to leverage existing techniques. Unlike prior work, we properly handle a stream of sensor measurements observed over time by deriving our algorithm using a continuous-time estimation framework. We evaluate our proposed method on a real-world dataset that we collect. Second, we present a method for scene flow estimation from a stream of LIDAR data. Inspired by optical flow and scene flow from the computer vision community, our framework can estimate dynamic motion in the scene without relying on segmentation and data association while still rivaling the results of state-of-the-art object tracking methods. We design our algorithms to exploit a graphics processing unit (GPU), enabling real-time performance. 
Third, we leverage deep learning tools to build a feature learning framework that allows us to train an encoding network to estimate features from a LIDAR occupancy grid. The learned feature space describes the geometric and semantic structure of any location observed by the LIDAR data. We formulate the training process so that distances in this learned feature space are meaningful in comparing the similarity of different locations. Accordingly, we demonstrate that using this feature space improves our estimate of the dynamic motion in the environment over time. In summary, this thesis presents three methods to aid in understanding a dynamic world for autonomous vehicle applications with LIDAR. These methods include a novel object tracking algorithm, a real-time scene flow estimation method, and a feature learning framework to aid in dynamic motion estimation. Furthermore, we demonstrate the performance of all our proposed methods on a collection of real-world datasets. PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/147587/1/aushani_1.pd

    Learning robust and efficient point cloud representations

    The abstract is in the attachment.

    Deep learning-based improvement for the outcomes of glaucoma clinical trials

    Glaucoma is the leading cause of irreversible blindness worldwide. It is a progressive optic neuropathy in which retinal ganglion cell (RGC) axon loss, probably as a consequence of damage at the optic disc, causes a loss of vision, predominantly affecting the mid-peripheral visual field (VF). Glaucoma results in a decrease in vision-related quality of life and, therefore, early detection and evaluation of disease progression rates are crucial in order to assess the risk of functional impairment and to establish sound treatment strategies. The aim of my research is to improve glaucoma diagnosis by enhancing state-of-the-art analyses of glaucoma clinical trial outcomes using advanced analytical methods. This knowledge would also help better design and analyse clinical trials, providing evidence for re-evaluating existing medications, facilitating diagnosis and suggesting novel disease management. To facilitate my objective methodology, this thesis provides the following contributions: (i) I developed deep learning-based super-resolution (SR) techniques for optical coherence tomography (OCT) image enhancement and demonstrated that using super-resolved images improves the statistical power of clinical trials, (ii) I developed a deep learning algorithm for segmentation of retinal OCT images, showing that the methodology consistently produces more accurate segmentations than state-of-the-art networks, (iii) I developed a deep learning framework for refining the relationship between structural and functional measurements and demonstrated that the mapping is significantly improved over previous techniques, (iv) I developed a probabilistic method and demonstrated that glaucomatous disc haemorrhages are influenced by a possible systemic factor that makes both eyes bleed simultaneously, and (v) I recalculated VF slopes, using the retinal nerve fiber layer thickness (RNFLT) from the super-resolved OCT as a Bayesian prior, and demonstrated that use of VF rates with the Bayesian prior as the outcome measure leads to a reduction in the sample size required to distinguish treatment arms in a clinical trial.

    Data-Driven Deep Learning-Based Analysis on THz Imaging

    Breast cancer affects about 12.5% of women in the United States. Surgical operations are often needed after diagnosis. Breast conserving surgery can help remove malignant tumors while preserving as much healthy tissue as possible. Owing to the lack of effective real-time tumor analysis tools and of a unified operation standard, the re-excision rate can exceed 30% among breast conserving surgery patients. This results in significant physical, physiological, and financial burdens for those patients. This work designs deep learning-based segmentation algorithms that detect tissue type in excised tissues using pulsed THz technology. This work evaluates the algorithms on the tissue type classification task for freshly excised tumor samples. Freshly excised tumor samples are more challenging than their formalin-fixed, paraffin-embedded (FFPE) block counterparts because of excessive fluid, image registration difficulties, and the lack of trustworthy pixelwise labels for each tissue sample. Additionally, evaluating freshly excised tumor samples is an important step towards potentially applying pulsed THz scan technology to breast conserving cancer surgery in the operating room. Recently, deep learning techniques have been heavily researched as GPU-based computation has become cheaper and more powerful. This dissertation revisits problems related to breast cancer tissue segmentation using the pulsed terahertz wave scan technique on murine samples and applies recent deep learning frameworks to enhance performance in various tasks. This study first performs pixelwise classification on terahertz scans with CNN-based neural networks and time-frequency feature tensors obtained by wavelet transformation. This study then explores a neural network based semantic segmentation strategy for terahertz scans that considers spatial information and incorporates noisy label handling with label correction techniques.
Additionally, this study performs resolution restoration for visual enhancement on terahertz scans using an unsupervised, generative image-to-image translation methodology. This work also proposes a novel data processing pipeline that trains a semantic segmentation network using only neural generated synthetic terahertz scans. The performance is evaluated using various evaluation metrics across the different tasks.