114 research outputs found
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as they relate to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.
Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensing
Deep Learning in Cardiology
The medical field is creating large amounts of data that physicians are unable
to decipher and use efficiently. Moreover, rule-based expert systems are
inefficient in solving complicated medical tasks or for creating insights using
big data. Deep learning has emerged as a more accurate and effective technology
in a wide range of medical problems such as diagnosis, prediction and
intervention. Deep learning is a representation learning method that consists
of layers that transform the data non-linearly, thus, revealing hierarchical
relationships and structures. In this review we survey deep learning
application papers that use structured data, signal and imaging modalities from
cardiology. We discuss the advantages and limitations of applying deep learning
in cardiology that also apply in medicine in general, while proposing certain
directions as the most viable for clinical use.
Comment: 27 pages, 2 figures, 10 tables
Augmented Deep Learning Techniques for Robotic State Estimation
While robotic systems may have once been relegated to structured environments and automation style tasks, in recent years these boundaries have begun to erode. As robots begin to operate in largely unstructured environments, it becomes more difficult for them to effectively interpret their surroundings. As sensor technology improves, the amount of data these robots must utilize can quickly become intractable. Additional challenges include environmental noise, dynamic obstacles, and inherent sensor non-linearities. Deep learning techniques have emerged as a way to efficiently deal with these challenges. While end-to-end deep learning can be convenient, challenges such as validation and training requirements can be prohibitive to its use.
In order to address these issues, we propose augmenting the power of deep learning techniques with tools such as optimization methods, physics based models, and human expertise. In this work, we present a principled framework for approaching a problem that allows a user to identify the types of augmentation methods and deep learning techniques best suited to their problem. To validate our framework, we consider three different domains: LIDAR based odometry estimation, hybrid soft robotic control, and sonar based underwater mapping.
First, we investigate LIDAR based odometry estimation which can be characterized with both high data precision and availability; ideal for augmenting with optimization methods. We propose using denoising autoencoders (DAEs) to address the challenges presented by modern LIDARs. Our proposed approach is comprised of two stages: a novel pre-processing stage for robust feature identification and a scan matching stage for motion estimation. Using real-world data from the University of Michigan North Campus long-term vision and LIDAR dataset (NCLT dataset) as well as the KITTI dataset, we show that our approach generalizes across domains; is capable of reducing the per-estimate error of standard ICP methods on average by 25.5% for the translational component and 57.53% for the rotational component; and is capable of reducing the computation time of state-of-the-art ICP methods by a factor of 7.94 on average while achieving competitive performance.
Next, we consider hybrid soft robotic control which has lower data precision due to real-world noise (e.g., friction and manufacturing imperfections). Here, augmenting with model based methods is more appropriate. We present a novel approach for modeling, and classifying between, the system load states introduced when constructing staged soft arm configurations. Our proposed approach is comprised of two stages: an LSTM calibration routine used to identify the current load state and a control input generation step that combines a generalized quasistatic model with the learned load model. We show our method is capable of classifying between different arm configurations at a rate greater than 95%. Additionally, our method is capable of reducing the end-effector error of quasistatic model only control to within 1 cm of our controller baseline.
Finally, we examine sonar based underwater mapping. Here, data is so noisy that augmenting with human experts and incorporating some global context is required. We develop a novel framework that enables the real-time 3D reconstruction of underwater environments using features from 2D sonar images. In our approach, a convolutional neural network (CNN) analyzes sonar imagery in real-time and only proposes a small subset of high-quality frames to the human expert for feature annotation. We demonstrate that our approach provides real-time reconstruction capability without loss in classification performance on datasets captured onboard our underwater vehicle while operating in a variety of environments.
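The scan matching stage in the LIDAR odometry work above builds on ICP. As a rough illustration only, the sketch below shows the closed-form alignment step that a 2D ICP iteration performs once point correspondences are fixed. It is not the authors' pipeline: the function names `align_2d` and `transform` are hypothetical, correspondences are assumed known (a full ICP loop would re-estimate them each iteration, and the thesis adds a learned pre-processing stage before matching), and the sample points are made up.

```python
import math

def transform(point, theta, tx, ty):
    """Rotate a 2D point by theta (radians), then translate by (tx, ty)."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + tx, s * x + c * y + ty)

def align_2d(source, target):
    """Least-squares rigid transform (theta, tx, ty) mapping source onto target.

    Assumes source[i] corresponds to target[i]. This is only the alignment
    step of one ICP iteration, not a complete ICP implementation.
    """
    n = len(source)
    cx_s = sum(p[0] for p in source) / n
    cy_s = sum(p[1] for p in source) / n
    cx_t = sum(q[0] for q in target) / n
    cy_t = sum(q[1] for q in target) / n
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ax, ay = px - cx_s, py - cy_s  # centred source point
        bx, by = qx - cx_t, qy - cy_t  # centred target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation moves the rotated source centroid onto the target centroid.
    tx = cx_t - (c * cx_s - s * cy_s)
    ty = cy_t - (s * cx_s + c * cy_s)
    return theta, tx, ty

if __name__ == "__main__":
    scan_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]
    scan_b = [transform(p, 0.3, 2.0, -1.0) for p in scan_a]
    print(align_2d(scan_a, scan_b))
```

With noise-free, correctly matched points the function recovers the applied motion. Real scans need outlier handling and repeated re-matching, which is where the reported gains over standard ICP would arise.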
Manifold Learning Approaches to Compressing Latent Spaces of Unsupervised Feature Hierarchies
Field robots encounter dynamic unstructured environments containing a vast array of unique objects. In order to make sense of the world in which they are placed, they collect large quantities of unlabelled data with a variety of sensors. Producing robust and reliable applications depends entirely on the ability of the robot to understand the unlabelled data it obtains. Deep Learning techniques have had a high level of success in learning powerful unsupervised representations for a variety of discriminative and generative models. Applying these techniques to problems encountered in field robotics remains a challenging endeavour. Modern Deep Learning methods are typically trained with a substantial labelled dataset, while datasets produced in a field robotics context contain limited labelled training data. The primary motivation for this thesis stems from the problem of applying large scale Deep Learning models to field robotics datasets that are label poor. While the lack of labelled ground truth data drives the desire for unsupervised methods, the need for improving the model scaling is driven by two factors, performance and computational requirements. When utilising unsupervised layer outputs as representations for classification, the classification performance increases with layer size. Scaling up models with multiple large layers of features is problematic, as the size of each subsequent hidden layer scales with the size of the previous layer. This quadratic scaling, and the associated time required to train such networks, has prevented adoption of large Deep Learning models beyond cluster computing. The contributions in this thesis are developed from the observation that parameters or filter elements learnt in Deep Learning systems are typically highly structured, and contain related elements. Firstly, the structure of unsupervised filters is utilised to construct a mapping from the high dimensional filter space to a low dimensional manifold.
This creates a significantly smaller representation for subsequent feature learning. This mapping, and its effect on the resulting encodings, highlights the need for the ability to learn highly overcomplete sets of convolutional features. Driven by this need, the unsupervised pretraining of Deep Convolutional Networks is developed to include a number of modern training and regularisation methods. These pretrained models are then used to provide initialisations for supervised convolutional models trained on low quantities of labelled data. By utilising pretraining, a significant increase in classification performance on a number of publicly available datasets is achieved. In order to apply these techniques to outdoor 3D Laser Illuminated Detection And Ranging data, we develop a set of resampling techniques to provide uniform input to Deep Learning models. The features learnt in these systems outperform the high effort hand engineered features developed specifically for 3D data. The representation of a given signal is then reinterpreted as a combination of modes that exist on the learnt low dimensional filter manifold. From this, we develop an encoding technique that allows the high dimensional layer output to be represented as a combination of low dimensional components. This allows the growth of subsequent layers to only be dependent on the intrinsic dimensionality of the filter manifold and not the number of elements contained in the previous layer. Finally, the resulting unsupervised convolutional model, the encoding frameworks and the embedding methodology are used to produce a new unsupervised learning strategy that is able to encode images in terms of overcomplete filter spaces, without producing an explosion in the size of the intermediate parameter spaces.
This model produces classification results on par with state of the art models, yet requires significantly fewer computational resources and is suitable for use in the constrained computation environment of a field robot.
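The quadratic scaling described in the abstract above can be made concrete with simple arithmetic. The sketch below counts connection weights between fully connected layers. It is an illustration under assumptions, not the thesis's model: the layer widths and the intrinsic dimension of 50 are hypothetical numbers, biases are ignored, and the function names are invented for this example.

```python
def dense_weights(n_in, n_out):
    """Number of weights connecting a layer of n_in units to one of n_out units."""
    return n_in * n_out

def stack_weights(widths):
    """Total weights in a chain of fully connected layers with the given widths."""
    return sum(dense_weights(a, b) for a, b in zip(widths, widths[1:]))

# Doubling every layer width quadruples the weight count (quadratic scaling).
small = stack_weights([1000, 1000, 1000])
large = stack_weights([2000, 2000, 2000])

# If the previous layer's output is re-encoded with a low dimensional
# representation (hypothetical intrinsic dimension of 50), the next layer's
# weight count depends on that dimension instead of the previous layer width.
full_next_layer = dense_weights(2000, 2000)
compressed_next_layer = dense_weights(50, 2000)

if __name__ == "__main__":
    print(small, large, large // small)
    print(full_next_layer, compressed_next_layer)
```

Under these assumed numbers the compressed input cuts the next layer's weights by the ratio of the previous layer width to the intrinsic dimension, which is the kind of saving the abstract attributes to the manifold encoding.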
Understanding a Dynamic World: Dynamic Motion Estimation for Autonomous Driving Using LIDAR
In a society that is heavily reliant on personal transportation, autonomous vehicles present an increasingly intriguing technology. They have the potential to save lives, promote efficiency, and enable mobility. However, before this vision becomes a reality, there are a number of challenges that must be solved. One key challenge involves problems in dynamic motion estimation, as it is critical for an autonomous vehicle to have an understanding of the dynamics in its environment for it to operate safely on the road. Accordingly, this thesis presents several algorithms for dynamic motion estimation for autonomous vehicles. We focus on methods using light detection and ranging (LIDAR), a prevalent sensing modality used by autonomous vehicle platforms, due to its advantages over other sensors, such as cameras, including lighting invariance and fidelity of 3D geometric data.
First, we propose a dynamic object tracking algorithm. The proposed method takes as input a stream of LIDAR data from a moving object collected by a multi-sensor platform. It generates an estimate of its trajectory over time and a point cloud model of its shape. We formulate the problem similarly to simultaneous localization and mapping (SLAM), allowing us to leverage existing techniques. Unlike prior work, we properly handle a stream of sensor measurements observed over time by deriving our algorithm using a continuous-time estimation framework. We evaluate our proposed method on a real-world dataset that we collect.
Second, we present a method for scene flow estimation from a stream of LIDAR data. Inspired by optical flow and scene flow from the computer vision community, our framework can estimate dynamic motion in the scene without relying on segmentation and data association while still rivaling the results of state-of-the-art object tracking methods. We design our algorithms to exploit a graphics processing unit (GPU), enabling real-time performance.
Third, we leverage deep learning tools to build a feature learning framework that allows us to train an encoding network to estimate features from a LIDAR occupancy grid. The learned feature space describes the geometric and semantic structure of any location observed by the LIDAR data. We formulate the training process so that distances in this learned feature space are meaningful in comparing the similarity of different locations. Accordingly, we demonstrate that using this feature space improves our estimate of the dynamic motion in the environment over time.
In summary, this thesis presents three methods to aid in understanding a dynamic world for autonomous vehicle applications with LIDAR. These methods include a novel object tracking algorithm, a real-time scene flow estimation method, and a feature learning framework to aid in dynamic motion estimation. Furthermore, we demonstrate the performance of all our proposed methods on a collection of real-world datasets.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/147587/1/aushani_1.pd
Learning robust and efficient point cloud representations
The abstract is in the attachment.
Deep learning-based improvement for the outcomes of glaucoma clinical trials
Glaucoma is the leading cause of irreversible blindness worldwide. It is a progressive optic neuropathy in which retinal ganglion cell (RGC) axon loss, probably as a consequence of damage at the optic disc, causes a loss of vision, predominantly affecting the mid-peripheral visual field (VF). Glaucoma results in a decrease in vision-related quality of life and, therefore, early detection and evaluation of disease progression rates is crucial in order to assess the risk of functional impairment and to establish sound treatment strategies. The aim of my research is to improve glaucoma diagnosis by enhancing state of the art analyses of glaucoma clinical trial outcomes using advanced analytical methods. This knowledge would also help better design and analyse clinical trials, providing evidence for re-evaluating existing medications, facilitating diagnosis and suggesting novel disease management.
To facilitate my objective methodology, this thesis provides the following contributions: (i) I developed deep learning-based super-resolution (SR) techniques for optical coherence tomography (OCT) image enhancement and demonstrated that using super-resolved images improves the statistical power of clinical trials, (ii) I developed a deep learning algorithm for segmentation of retinal OCT images, showing that the methodology consistently produces more accurate segmentations than state-of-the-art networks, (iii) I developed a deep learning framework for refining the relationship between structural and functional measurements and demonstrated that the mapping is significantly improved over previous techniques, (iv) I developed a probabilistic method and demonstrated that glaucomatous disc haemorrhages are influenced by a possible systemic factor that makes both eyes bleed simultaneously, and (v) I recalculated VF slopes, using the retinal nerve fiber layer thickness (RNFLT) from the super-resolved OCT as a Bayesian prior, and demonstrated that use of VF rates with the Bayesian prior as the outcome measure leads to a reduction in the sample size required to distinguish treatment arms in a clinical trial.
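Contribution (v) above combines a structural prior with measured VF slopes. The abstract does not state the form of the prior, so the sketch below uses the textbook conjugate update for a normal prior and a normal likelihood purely as an assumed illustration. The function name `posterior_normal` and all numeric values (slopes in hypothetical dB/year) are invented for this example and do not come from the thesis.

```python
def posterior_normal(prior_mean, prior_var, obs_mean, obs_var):
    """Combine a normal prior with a normal observation of the same quantity.

    Returns the posterior mean and variance. Each source is weighted by its
    precision (1 / variance), so the less noisy source dominates.
    """
    prior_prec = 1.0 / prior_var
    obs_prec = 1.0 / obs_var
    post_var = 1.0 / (prior_prec + obs_prec)
    post_mean = post_var * (prior_prec * prior_mean + obs_prec * obs_mean)
    return post_mean, post_var

if __name__ == "__main__":
    # Hypothetical values: a prior slope inferred from structure and a noisier
    # slope fitted to the VF series itself.
    mean, var = posterior_normal(prior_mean=-0.5, prior_var=0.04,
                                 obs_mean=-1.0, obs_var=0.16)
    print(mean, var)
```

In this simplified setting the posterior variance is always smaller than either input variance. A less noisy outcome measure is one plausible route to the smaller required sample size that the abstract reports.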
Data-Driven Deep Learning-Based Analysis on THz Imaging
Breast cancer affects about 12.5% of women in the United States. Surgery is often needed after diagnosis. Breast conserving surgery can remove malignant tumors while preserving as much healthy tissue as possible. Because effective real-time tumor analysis tools and a unified operating standard are lacking, the re-excision rate among breast conserving surgery patients can exceed 30%. This results in significant physical, physiological, and financial burdens for those patients. This work designs deep learning-based segmentation algorithms that detect tissue type in excised tissues using pulsed THz technology and evaluates them on tissue type classification among freshly excised tumor samples. Freshly excised tumor samples are more challenging than their formalin-fixed, paraffin-embedded (FFPE) block counterparts because of excessive fluid, image registration difficulties, and the lack of trustworthy pixelwise labels for each tissue sample. Additionally, evaluating freshly excised tumor samples matters because it points toward applying pulsed THz scanning to breast conserving cancer surgery in the operating room. Deep learning techniques have been heavily researched in recent years as GPU-based computing power has become cheaper and stronger. This dissertation revisits breast cancer tissue segmentation problems using pulsed terahertz scans of murine samples and applies recent deep learning frameworks to improve performance on various tasks. This study first performs pixelwise classification on terahertz scans with CNN-based neural networks and time-frequency feature tensors obtained by wavelet transformation. This study then explores neural network-based semantic segmentation of terahertz scans that takes spatial information into account and handles noisy labels with label correction techniques.
Additionally, this study performs resolution restoration for visual enhancement of terahertz scans using an unsupervised, generative image-to-image translation methodology. This work also proposes a novel data processing pipeline that trains a semantic segmentation network using only synthetic terahertz scans generated by a neural network. Performance on each task is evaluated using several metrics.
- …