Advancing efficiency and robustness of neural networks for imaging
Enabling machines to see and analyze the world is a longstanding research objective. Advances in computer vision have the potential to influence many aspects of our lives, as they can enable machines to tackle a variety of tasks. Great progress has been made in computer vision, catalyzed by recent advances in machine learning and especially the breakthroughs achieved by deep artificial neural networks.
The goal of this work is to alleviate limitations of deep neural networks that hinder their large-scale adoption for real-world applications. To this end, it investigates methodologies for constructing and training deep neural networks with low computational requirements. Moreover, it explores strategies for achieving robust performance on unseen data. Of particular interest is the application of segmenting volumetric medical scans, because of the technical challenges it poses as well as its clinical importance. The developed methodologies are generic and of relevance to a broader computer vision and machine learning audience.
More specifically, this work introduces an efficient 3D convolutional neural network architecture, which achieves high performance for segmentation of volumetric medical images, an application previously hindered by the high computational requirements of 3D networks. It then investigates the sensitivity of network performance to hyper-parameter configuration, which we interpret as overfitting the model configuration to the data available during development. It is shown that ensembling a set of models with diverse configurations mitigates this and improves generalization. The thesis then explores how to utilize unlabelled data for learning representations that generalize better. It investigates domain adaptation and introduces an architecture for adversarial networks tailored for adaptation of segmentation networks. Finally, a novel semi-supervised learning method is proposed that introduces a graph in the latent space of a neural network to capture relations between labelled and unlabelled samples. It then regularizes the embedding to form a compact cluster per class, which improves generalization.
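The configuration-ensembling idea can be sketched in a few lines: average the class-probability maps produced by models trained under different hyper-parameter configurations, then take the consensus label. The probability maps below are hypothetical stand-ins for real network outputs.

```python
import numpy as np

def ensemble_predict(prob_maps):
    """Average class-probability maps from models trained with
    different hyper-parameter configurations, then take argmax."""
    stacked = np.stack(prob_maps)      # (n_models, n_classes, n_voxels)
    mean_probs = stacked.mean(axis=0)  # consensus distribution per voxel
    return mean_probs.argmax(axis=0)   # final label per voxel

# Hypothetical outputs of three differently configured models
# for a 4-voxel scan with 2 classes (rows sum to 1 per voxel):
m1 = np.array([[0.9, 0.60, 0.4, 0.2], [0.1, 0.40, 0.6, 0.8]])
m2 = np.array([[0.7, 0.45, 0.5, 0.1], [0.3, 0.55, 0.5, 0.9]])
m3 = np.array([[0.8, 0.50, 0.3, 0.3], [0.2, 0.50, 0.7, 0.7]])
labels = ensemble_predict([m1, m2, m3])
print(labels)  # per-voxel consensus labels
```

Averaging probabilities (rather than majority-voting hard labels) lets a confident model outvote two uncertain ones, which is one reason diverse-configuration ensembles generalize better.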
Data driven approaches for investigating molecular heterogeneity of the brain
It has been proposed that one of the clearest organizing principles for most sensory systems is the existence of parallel subcircuits and processing streams that form orderly and systematic mappings from stimulus space to neurons. Although the spatial heterogeneity of the early olfactory circuitry has long been recognized, we know comparatively little about the circuits that propagate sensory signals downstream. Investigating the potential modularity of the bulb's intrinsic circuits proves to be a difficult task, as termination patterns of converging projections, as with the bulb's inputs, are not feasibly mapped. Thus, if such circuit motifs exist, their detection essentially relies on identifying differential gene expression, or "molecular signatures," that may demarcate functional subregions. With the arrival of comprehensive (whole-genome, cellular-resolution) datasets in biology and neuroscience, it is now possible to carry out large-scale investigations, making particular use of the densely catalogued, whole-genome expression maps of the Allen Brain Atlas to systematically investigate the molecular topography of the olfactory bulb's intrinsic circuits. To address the challenges associated with high-throughput, high-dimensional datasets, a deep learning approach will form the backbone of our informatic pipeline. In the proposed work, we test the hypothesis that the bulb's intrinsic circuits are parceled into distinct, parallel modules that can be defined by genome-wide patterns of expression. In pursuit of this aim, our deep learning framework will facilitate the group registration of the mitral cell layers of ~50,000 in-situ olfactory bulb circuits to test this hypothesis.
Methods for the analysis and characterization of brain morphology from MRI images
Brain magnetic resonance imaging (MRI) is an imaging modality that produces
detailed images of the brain without using any ionizing radiation.
From a structural MRI scan, it is possible to extract morphological properties
of different brain regions, such as their volume and shape. These measures
can both allow a better understanding of how the brain changes due
to multiple factors (e.g., environmental and pathological) and contribute to
the identification of new imaging biomarkers of neurological and psychiatric
diseases. The overall goal of the present thesis is to advance the knowledge
on how brain MRI image processing can be effectively used to analyze and
characterize brain structure.
The first two works presented in this thesis are animal studies that primarily
aim to use MRI data for analyzing differences between groups of
interest. In Paper I, MRI scans from wild and domestic rabbits were processed
to identify structural brain differences between these two groups.
Domestication was found to significantly reshape brain structure in terms
of both regional gray matter volume and white matter integrity. In Paper II,
rat brain MRI scans were used to train a brain age prediction model. This
model was then tested on both controls and a group of rats that underwent
long-term environmental enrichment and dietary restriction. This healthy
lifestyle intervention was shown to significantly affect the predicted brain
age trajectories by slowing the rats' aging process compared to controls.
Furthermore, brain age predicted in young adult rats was found to have a
significant effect on survival.
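The brain-age paradigm described above can be sketched with a single linear regressor and synthetic data; the real model in Paper II is more elaborate, and the feature, units, and coefficients below are purely illustrative. The "brain age gap" is the predicted minus the chronological age.

```python
import numpy as np

def fit_brain_age(features, ages):
    """Fit a linear brain-age model: age ~ features @ w + b (least squares)."""
    X = np.column_stack([features, np.ones(len(features))])  # add intercept
    coef, *_ = np.linalg.lstsq(X, ages, rcond=None)
    return coef

def brain_age_gap(coef, features, true_ages):
    """Predicted minus chronological age; positive values suggest
    an 'older-looking' brain for that animal."""
    X = np.column_stack([features, np.ones(len(features))])
    return X @ coef - true_ages

# Toy data: one hypothetical morphological feature that shrinks with age
rng = np.random.default_rng(0)
ages = rng.uniform(3, 24, 100)                         # months
volume = 10.0 - 0.3 * ages + rng.normal(0, 0.2, 100)   # synthetic measure
coef = fit_brain_age(volume[:, None], ages)
gaps = brain_age_gap(coef, volume[:, None], ages)
print(gaps.mean())  # ~0 on the training sample, by construction of OLS
```

Interventions (such as the enrichment and dietary restriction above) are then assessed by whether they shift the gap, and survival analyses can use the gap as a covariate.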
Papers III to V are human studies that propose deep learning-based
methods for segmenting brain structures that can be severely affected by
neurodegeneration. In particular, Papers III and IV focus on U-Net-based
2D segmentation of the corpus callosum (CC) in multiple sclerosis (MS)
patients. In both studies, good segmentation accuracy was obtained and a
significant correlation was found between CC area and the patients' level of
cognitive and physical disability. Additionally, in Paper IV, shape analysis
of the segmented CC revealed a significant association between disability
and both CC thickness and bending angle. In Paper V, a novel
method for automatic segmentation of the hippocampus is proposed, which
consists of embedding a statistical shape prior as context information into
a U-Net-based framework. The inclusion of shape information was shown
to significantly improve segmentation accuracy when testing the method
on a new unseen cohort (i.e., different from the one used for training).
Furthermore, good performance was observed across three different diagnostic
groups (healthy controls, subjects with mild cognitive impairment
and Alzheimer's patients) that were characterized by different levels of hippocampal
atrophy.
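Segmentation accuracy in studies like these is commonly reported as the Dice overlap between the predicted and reference masks; a minimal implementation:

```python
import numpy as np

def dice_score(pred, target):
    """Dice overlap between two binary masks: 2|A∩B| / (|A| + |B|).
    Returns 1.0 when both masks are empty (perfect agreement)."""
    pred = np.asarray(pred, bool)
    target = np.asarray(target, bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy 1-D masks standing in for flattened segmentation volumes
pred   = np.array([0, 1, 1, 1, 0, 0])
target = np.array([0, 1, 1, 0, 0, 0])
print(dice_score(pred, target))  # 2*2 / (3 + 2) = 0.8
```

Dice is preferred over plain voxel accuracy for small structures such as the corpus callosum or hippocampus, where background voxels dominate.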
In summary, the studies presented in this thesis support the great value
of MRI image analysis for the advancement of neuroscientific knowledge,
and their contribution is twofold. First, by applying well-established
processing methods to datasets that had not yet been explored in the literature,
it was possible to characterize specific brain changes and disentangle
relevant problems of a clinical or biological nature. Second, a technical
contribution is provided by modifying and extending already-existing brain
image processing methods to achieve good performance on new datasets.
A PhD Dissertation on Road Topology Classification for Autonomous Driving
Road topology classification is a crucial point if we want to develop complete and safe
autonomous driving systems. It is logical to think that a thorough understanding of
the environment surrounding the ego-vehicle, as it happens when a human being is a
decision-maker at the wheel, is an indispensable condition if we want to advance in the
achievement of level 4 or 5 autonomous vehicles. If the driver, either an autonomous
system or a human being, does not have access to the information of the environment,
the decrease in safety is critical, and the accident is almost instantaneous, i.e., when a
driver falls asleep at the wheel.
Throughout this doctoral thesis, we present two deep learning systems that help
an autonomous driving system understand the environment it is in at any given instant.
The first one, 3D-Deep and its optimization 3D-Deepest, is a new network architecture
for semantic road segmentation in which data sources of different types are integrated.
Road segmentation is vital in an autonomous vehicle since it is the medium on which
it should drive in 99.9% of the cases. The second is an urban intersection classification
system using different approaches comprising metric learning, temporal integration, and
synthetic image generation. Safety is a crucial point in any autonomous system, and if it
is a driving system, even more so. Intersections are one of the places within cities where
safety is critical. Cars follow intersecting trajectories and can therefore collide; most
intersections are used by pedestrians to cross the road regardless of whether there are
crosswalks or not, which alarmingly increases the risk of run-overs and collisions.
Combining both systems substantially improves the understanding of the environment
and can be considered to increase safety, paving the way in the research towards a fully
autonomous vehicle.
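The metric-learning component for intersection classification is often trained with a triplet loss, which pulls embeddings of the same intersection type together and pushes different types apart. A minimal sketch, with hypothetical 2-D embeddings standing in for the network's output:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge loss: zero once the anchor is closer to the positive
    than to the negative by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(d_pos - d_neg + margin, 0.0)

# Hypothetical 2-D embeddings of intersection images
a = np.array([0.0, 0.0])   # anchor: a four-way crossing
p = np.array([0.1, 0.0])   # another image of the same type
n = np.array([2.0, 0.0])   # an image of a different type
print(triplet_loss(a, p, n))  # 0.1 - 2.0 + 1.0 < 0, so loss = 0.0
```

At inference time, a new intersection image is classified by embedding it and finding its nearest labelled neighbours in this learned space.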
Symbiotic deep learning for medical image analysis with applications in real-time diagnosis for fetal ultrasound screening
The last hundred years have seen a monumental rise in the power and capability of machines to
perform intelligent tasks in the stead of previously human operators. This rise is not expected
to slow down any time soon and what this means for society and humanity as a whole remains
to be seen. The overwhelming notion is that with the right goals in mind, the growing influence
of machines on our everyday tasks will enable humanity to give more attention to the truly
groundbreaking challenges that we all face together. This will usher in a new age of human
machine collaboration in which humans and machines may work side by side to achieve greater
heights for all of humanity. Intelligent systems are useful in isolation, but the true benefits of
intelligent systems come to the fore in complex systems where the interaction between humans
and machines can be made seamless; it is this goal of symbiosis between human and machine,
one that may democratise complex knowledge, which motivates this thesis. In the recent past, data-driven
methods have come to the fore and now represent the state of the art in many different
fields. Alongside the shift from rule-based towards data-driven methods we have also seen a
shift in how humans interact with these technologies. Human computer interaction is changing
in response to data-driven methods and new techniques must be developed to enable the same
symbiosis between man and machine for data-driven methods as for previous formula-driven
technology.
We address five key challenges which need to be overcome for data-driven human-in-the-loop
computing to reach maturity. These are (1) the "Categorisation Challenge", where we examine
existing work and form a taxonomy of the different methods being utilised for data-driven
human-in-the-loop computing; (2) the "Confidence Challenge", where data-driven methods must
communicate interpretable beliefs in how confident their predictions are; (3) the "Complexity
Challenge", where reasoned communication becomes increasingly important as the complexity
of tasks, and of the methods that solve them, increases; (4) the "Classification Challenge", in
which we look at how complex methods can be separated in order to provide greater reasoning
in complex classification tasks; and finally (5) the "Curation Challenge", where we challenge the
assumptions around bottleneck creation for the development of supervised learning methods.
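One simple way to address the "Confidence Challenge" is to turn a classifier's softmax output into an interpretable 0-to-1 confidence score via normalised entropy. This is a generic sketch, not the specific method developed in the thesis:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def confidence(logits):
    """Map a prediction's softmax distribution to a 0-1 score:
    1 = one class gets all the mass, 0 = uniform guessing.
    Computed as 1 minus the entropy normalised by its maximum."""
    p = softmax(logits)
    entropy = -(p * np.log(p + 1e-12)).sum()
    return 1.0 - entropy / np.log(len(p))

print(confidence(np.array([5.0, 0.1, 0.1])))  # high, near 1
print(confidence(np.array([1.0, 1.0, 1.0])))  # ~0: maximally uncertain
```

A score like this can be surfaced to the human in the loop, so that low-confidence predictions are routed for review rather than acted on automatically.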
Autoencoding sensory substitution
Tens of millions of people live blind, and their number is ever increasing. Visual-to-auditory sensory substitution (SS) encompasses a family of cheap, generic solutions to assist the visually impaired by conveying visual information through sound. The required SS training is lengthy: months of effort are necessary to reach a practical level of adaptation. There are two reasons for the tedious training process: the elongated substituting audio signal, and the disregard for the compressive characteristics of the human hearing system.
To overcome these obstacles, we developed a novel class of SS methods, by training deep recurrent autoencoders for image-to-sound conversion. We successfully trained deep learning models on different datasets to execute visual-to-auditory stimulus conversion. By constraining the visual space, we demonstrated the viability of shortened substituting audio signals, while proposing mechanisms, such as the integration of computational hearing models, to optimally convey visual features in the substituting stimulus as perceptually discernible auditory components. We tested our approach in two separate cases. In the first experiment, the author went blindfolded for 5 days, while performing SS training on hand posture discrimination. The second experiment assessed the accuracy of reaching movements towards objects on a table. In both test cases, above-chance-level accuracy was attained after a few hours of training.
Our novel SS architecture broadens the horizon of rehabilitation methods engineered for the visually impaired. Further improvements on the proposed model should yield faster rehabilitation of the blind and, as a consequence, wider adoption of SS devices.
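The encoder-decoder structure behind image-to-sound conversion can be sketched with random, untrained weights; the layer sizes, dense (non-recurrent) form, and tanh nonlinearity below are illustrative assumptions, not the thesis architecture. The encoder compresses an image into a short bounded code that a synthesiser could render as a brief audio signal; training would minimise the decoder's reconstruction error so the code preserves visual content.

```python
import numpy as np

rng = np.random.default_rng(42)

def encode(image_vec, W_enc):
    """Compress an image into a short, bounded latent code
    (the values a synthesiser would turn into audio parameters)."""
    return np.tanh(W_enc @ image_vec)

def decode(code, W_dec):
    """Reconstruct the image from the code; training drives this
    reconstruction toward the original input."""
    return W_dec @ code

n_pixels, n_code = 64, 8   # 8 latent values per image frame (assumed sizes)
W_enc = rng.normal(0, 0.1, (n_code, n_pixels))
W_dec = rng.normal(0, 0.1, (n_pixels, n_code))

image = rng.random(n_pixels)
code = encode(image, W_enc)
recon = decode(code, W_dec)
print(code.shape, recon.shape)  # (8,) (64,)
```

The key property exploited by the thesis is that a short code means a short substituting audio signal, directly attacking the first cause of slow SS training.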
Task-specific and interpretable feature learning
Deep learning models have had tremendous impacts in recent years, while a question has been raised by many: Is deep learning just a triumph of empiricism? There has been emerging interest in reducing the gap between the theoretical soundness and interpretability, and the empirical success of deep models. This dissertation provides a comprehensive discussion on bridging traditional model-based learning approaches that emphasize problem-specific reasoning, and deep models that allow for larger learning capacity. The overall goal is to devise the next-generation feature learning architectures that are: 1) task-specific, namely, optimizing the entire pipeline from end to end while taking advantage of available prior knowledge and domain expertise; and 2) interpretable, namely, being able to learn a representation consisting of semantically sensible variables, and to display predictable behaviors.
This dissertation starts by showing how classical sparse coding models can be improved in a task-specific way, by formulating the entire pipeline as bi-level optimization. Then, it mainly illustrates how to incorporate the structure of classical learning models, e.g., sparse coding, into the design of deep architectures. A few concrete model examples are presented, ranging from sparse approximation models to constrained and dual-sparsity models. The analytic tools from the optimization problems can be translated to guide the architecture design and performance analysis of deep models. As a result, these customized deep models demonstrate improved performance, intuitive interpretation, and efficient parameter initialization. On the other hand, deep networks are shown to be analogous to brain mechanisms, exhibiting the ability to describe semantic content from the primitive level to the abstract level. This dissertation thus also presents a preliminary investigation of the synergy between feature learning and cognitive science and neuroscience. Two novel application domains, image aesthetics assessment and brain encoding, are explored, with promising preliminary results achieved.
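Sparse coding of the kind incorporated into deep architectures is classically solved with iterative shrinkage-thresholding (ISTA): each iteration is a linear step followed by a pointwise nonlinearity, which is exactly the structure a "layer" inherits when the iterations are unrolled into a network. A toy sketch with a random dictionary (all sizes and parameters are illustrative):

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of the l1 norm (the 'nonlinearity')."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def ista(D, y, lam=0.1, n_iter=100):
    """ISTA for min_z 0.5*||y - D z||^2 + lam*||z||_1.
    Each iteration = linear step + soft threshold, i.e. one 'layer'
    of the unrolled network; learning the step matrices gives a
    LISTA-style deep architecture."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = soft_threshold(z + D.T @ (y - D @ z) / L, lam / L)
    return z

rng = np.random.default_rng(1)
D = rng.normal(size=(20, 50))
D /= np.linalg.norm(D, axis=0)             # unit-norm dictionary atoms
z_true = np.zeros(50)
z_true[[3, 17]] = [1.5, -2.0]              # 2-sparse ground truth
y = D @ z_true                             # noiseless measurement
z_hat = ista(D, y)
print(np.count_nonzero(z_hat))             # few active atoms
```

Replacing the fixed matrices `D.T / L` with learned parameters, and truncating to a handful of iterations, yields the task-specific deep models the dissertation analyzes.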