
    Advancing efficiency and robustness of neural networks for imaging

    Enabling machines to see and analyze the world is a longstanding research objective. Advances in computer vision have the potential to influence many aspects of our lives, as they can enable machines to tackle a variety of tasks. Great progress in computer vision has been made, catalyzed by recent progress in machine learning and especially the breakthroughs achieved by deep artificial neural networks. The goal of this work is to alleviate limitations of deep neural networks that hinder their large-scale adoption for real-world applications. To this end, it investigates methodologies for constructing and training deep neural networks with low computational requirements. Moreover, it explores strategies for achieving robust performance on unseen data. Of particular interest is the application of segmenting volumetric medical scans, because of the technical challenges it poses as well as its clinical importance. The developed methodologies are generic and of relevance to a broader computer vision and machine learning audience. More specifically, this work introduces an efficient 3D convolutional neural network architecture that achieves high performance for segmentation of volumetric medical images, an application previously hindered by the high computational requirements of 3D networks. It then investigates the sensitivity of network performance to hyper-parameter configuration, which we interpret as overfitting the model configuration to the data available during development. It is shown that ensembling a set of models with diverse configurations mitigates this and improves generalization. The thesis then explores how to utilize unlabelled data for learning representations that generalize better. It investigates domain adaptation and introduces an adversarial network architecture tailored for adaptation of segmentation networks. Finally, a novel semi-supervised learning method is proposed that introduces a graph in the latent space of a neural network to capture relations between labelled and unlabelled samples. It then regularizes the embedding to form a compact cluster per class, which improves generalization.
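    As a concrete illustration of the ensembling step described above, the sketch below averages per-voxel class probabilities from several hypothetical segmentation models trained with diverse hyper-parameter configurations and takes the arg-max class. The function and array names are assumptions made for illustration, not code from the thesis.

        import numpy as np

        def ensemble_segmentation(prob_maps):
            """Average per-voxel class probabilities predicted by models trained
            with diverse hyper-parameter configurations, then take the arg-max.

            prob_maps: list of arrays shaped (num_classes, D, H, W), one per model.
            """
            stacked = np.stack(prob_maps, axis=0)   # (K, C, D, H, W)
            mean_probs = stacked.mean(axis=0)       # consensus probabilities
            return mean_probs.argmax(axis=0)        # (D, H, W) label map

        # Toy usage with three random "models" on a tiny 3-class volume.
        rng = np.random.default_rng(0)
        maps = [rng.dirichlet(np.ones(3), size=(8, 8, 8)).transpose(3, 0, 1, 2)
                for _ in range(3)]
        print(ensemble_segmentation(maps).shape)    # (8, 8, 8)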

    Data driven approaches for investigating molecular heterogeneity of the brain

    It has been proposed that one of the clearest organizing principles for most sensory systems is the existence of parallel subcircuits and processing streams that form orderly and systematic mappings from stimulus space to neurons. Although the spatial heterogeneity of the early olfactory circuitry has long been recognized, we know comparatively little about the circuits that propagate sensory signals downstream. Investigating the potential modularity of the bulb's intrinsic circuits proves to be a difficult task, as the termination patterns of converging projections, unlike those of the bulb's inputs, cannot feasibly be mapped. Thus, if such circuit motifs exist, their detection essentially relies on identifying differential gene expression, or "molecular signatures," that may demarcate functional subregions. With the arrival of comprehensive (whole-genome, cellular-resolution) datasets in biology and neuroscience, it is now possible to carry out large-scale investigations and, in particular, to use the densely catalogued, whole-genome expression maps of the Allen Brain Atlas for systematic investigations of the molecular topography of the olfactory bulb's intrinsic circuits. To address the challenges associated with high-throughput, high-dimensional datasets, a deep learning approach will form the backbone of our informatic pipeline. In the proposed work, we test the hypothesis that the bulb's intrinsic circuits are parceled into distinct, parallel modules that can be defined by genome-wide patterns of expression. In pursuit of this aim, our deep learning framework will facilitate the group registration of the mitral cell layers of ~50,000 in-situ olfactory bulb circuits to test this hypothesis.
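    The idea that "molecular signatures" may demarcate functional subregions can be illustrated with a toy sketch: cluster voxel-wise, genome-wide expression profiles (of the kind catalogued in the Allen Brain Atlas) and treat spatially coherent clusters as candidate modules. This is a hypothetical illustration on random data, not the deep learning registration pipeline proposed in the work.

        import numpy as np
        from sklearn.cluster import KMeans

        # Hypothetical matrix: rows are bulb voxels, columns are genes, entries
        # are expression energies (toy random values standing in for ISH data).
        rng = np.random.default_rng(1)
        expression = rng.random((5000, 200))    # 5,000 voxels x 200 genes

        # Cluster voxels by their genome-wide expression profile; spatially
        # coherent clusters would be candidate molecular modules.
        modules = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(expression)
        print(np.bincount(modules))             # voxel count per candidate module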

    Audio-Visual Egocentric Action Recognition


    Methods for the analysis and characterization of brain morphology from MRI images

    Brain magnetic resonance imaging (MRI) is an imaging modality that produces detailed images of the brain without using any ionizing radiation. From a structural MRI scan, it is possible to extract morphological properties of different brain regions, such as their volume and shape. These measures can both allow a better understanding of how the brain changes due to multiple factors (e.g., environmental and pathological) and contribute to the identification of new imaging biomarkers of neurological and psychiatric diseases. The overall goal of the present thesis is to advance the knowledge on how brain MRI image processing can be effectively used to analyze and characterize brain structure. The first two works presented in this thesis are animal studies that primarily aim to use MRI data for analyzing differences between groups of interest. In Paper I, MRI scans from wild and domestic rabbits were processed to identify structural brain differences between these two groups. Domestication was found to significantly reshape brain structure in terms of both regional gray matter volume and white matter integrity. In Paper II, rat brain MRI scans were used to train a brain age prediction model. This model was then tested on both controls and a group of rats that underwent long-term environmental enrichment and dietary restriction. This healthy lifestyle intervention was shown to significantly affect the predicted brain age trajectories by slowing the rats' aging process compared to controls. Furthermore, brain age predicted on young adult rats was found to have a significant effect on survival. Papers III to V are human studies that propose deep learning-based methods for segmenting brain structures that can be severely affected by neurodegeneration. In particular, Papers III and IV focus on U-Net-based 2D segmentation of the corpus callosum (CC) in multiple sclerosis (MS) patients. In both studies, good segmentation accuracy was obtained and a significant correlation was found between CC area and the patient's level of cognitive and physical disability. Additionally, in Paper IV, shape analysis of the segmented CC revealed a significant association between disability and both CC thickness and bending angle. Conversely, in Paper V, a novel method for automatic segmentation of the hippocampus is proposed, which consists of embedding a statistical shape prior as context information into a U-Net-based framework. The inclusion of shape information was shown to significantly improve segmentation accuracy when testing the method on a new unseen cohort (i.e., different from the one used for training). Furthermore, good performance was observed across three different diagnostic groups (healthy controls, subjects with mild cognitive impairment and Alzheimer's patients) that were characterized by different levels of hippocampal atrophy. In summary, the studies presented in this thesis support the great value of MRI image analysis for the advancement of neuroscientific knowledge, and their contribution is mostly two-fold. First, by applying well-established processing methods on datasets that had not yet been explored in the literature, it was possible to characterize specific brain changes and disentangle relevant problems of a clinical or biological nature. Second, a technical contribution is provided by modifying and extending already-existing brain image processing methods to achieve good performance on new datasets.
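    A minimal sketch of the shape-prior idea in Paper V is given below, assuming the statistical prior is simply injected as an extra input channel of a small convolutional network; the layer sizes and class names are hypothetical and do not reproduce the published U-Net-based framework.

        import torch
        import torch.nn as nn

        class PriorGuidedSegNet(nn.Module):
            """Toy 2D segmentation network that receives an MRI slice plus a
            probabilistic shape prior of the hippocampus as a second channel."""
            def __init__(self, num_classes=2):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                )
                self.head = nn.Conv2d(32, num_classes, 1)

            def forward(self, mri, shape_prior):
                x = torch.cat([mri, shape_prior], dim=1)   # prior as context
                return self.head(self.encoder(x))

        net = PriorGuidedSegNet()
        mri = torch.randn(1, 1, 128, 128)    # one MRI slice
        prior = torch.rand(1, 1, 128, 128)   # voxel-wise prior probability
        print(net(mri, prior).shape)         # torch.Size([1, 2, 128, 128])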

    A PhD Dissertation on Road Topology Classification for Autonomous Driving

    Road topology classification is a crucial point if we want to develop complete and safe autonomous driving systems. It is logical to think that a thorough understanding of the environment surrounding the ego-vehicle, as happens when a human being is the decision-maker at the wheel, is an indispensable condition if we want to advance towards level 4 or 5 autonomous vehicles. If the driver, whether an autonomous system or a human being, does not have access to information about the environment, the decrease in safety is critical and the accident is almost instantaneous, e.g., when a driver falls asleep at the wheel. Throughout this doctoral thesis, we present two deep learning systems that help an autonomous driving system understand the environment in which it is driving at any given instant. The first, 3D-Deep and its optimization 3D-Deepest, is a new network architecture for semantic road segmentation in which data sources of different types are integrated. Road segmentation is vital in an autonomous vehicle, since the road is the medium on which it should drive in 99.9% of cases. The second is an urban intersection classification system using different approaches comprising metric learning, temporal integration, and synthetic image generation. Safety is a crucial point in any autonomous system, and even more so in a driving system.
Intersections are among the places within cities where safety is most critical. Vehicles follow intersecting trajectories and can therefore collide, and most intersections are also used by pedestrians to cross the road, whether or not there are crosswalks, which sharply increases the risk of collisions and of pedestrians being run over. Combining both systems substantially improves the understanding of the environment and can be considered to increase safety, paving the way towards a fully autonomous vehicle.
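    The metric-learning component of the intersection classifier can be sketched with a generic triplet-loss embedding, as below; the toy backbone, tensor shapes, and training data are assumptions for illustration and are not the networks described in the thesis.

        import torch
        import torch.nn as nn

        # Images of the same intersection topology should map close together in
        # the embedding space, images of different topologies far apart.
        embedder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 64),
        )
        criterion = nn.TripletMarginLoss(margin=1.0)

        anchor = torch.randn(8, 3, 64, 64)    # frames of one intersection type
        positive = torch.randn(8, 3, 64, 64)  # same type, different view/time
        negative = torch.randn(8, 3, 64, 64)  # a different topology
        loss = criterion(embedder(anchor), embedder(positive), embedder(negative))
        loss.backward()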

    Symbiotic deep learning for medical image analysis with applications in real-time diagnosis for fetal ultrasound screening

    The last hundred years have seen a monumental rise in the power and capability of machines to perform intelligent tasks in the stead of human operators. This rise is not expected to slow down any time soon, and what it means for society and humanity as a whole remains to be seen. The overwhelming notion is that, with the right goals in mind, the growing influence of machines on our everyday tasks will enable humanity to give more attention to the truly groundbreaking challenges that we all face together. This will usher in a new age of human-machine collaboration in which humans and machines may work side by side to achieve greater heights for all of humanity. Intelligent systems are useful in isolation, but their true benefits come to the fore in complex systems where the interaction between humans and machines can be made seamless, and it is this goal of symbiosis between human and machine, which may democratise complex knowledge, that motivates this thesis. In the recent past, data-driven methods have come to the fore and now represent the state of the art in many different fields. Alongside the shift from rule-based towards data-driven methods, we have also seen a shift in how humans interact with these technologies. Human-computer interaction is changing in response to data-driven methods, and new techniques must be developed to enable the same symbiosis between man and machine for data-driven methods as for previous formula-driven technology. We address five key challenges which need to be overcome for data-driven human-in-the-loop computing to reach maturity. These are (1) the 'Categorisation Challenge', where we examine existing work and form a taxonomy of the different methods being utilised for data-driven human-in-the-loop computing; (2) the 'Confidence Challenge', where data-driven methods must communicate interpretable beliefs about how confident their predictions are; (3) the 'Complexity Challenge', where reasoned communication becomes increasingly important as the complexity of tasks, and of the methods used to solve them, increases; (4) the 'Classification Challenge', in which we look at how complex methods can be separated in order to provide greater reasoning in complex classification tasks; and finally (5) the 'Curation Challenge', where we challenge the assumptions around bottleneck creation for the development of supervised learning methods.
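    One common way to tackle something like the 'Confidence Challenge' is Monte Carlo dropout, where dropout is kept active at test time and the spread of repeated predictions is reported as an uncertainty estimate. The sketch below is a generic illustration of that idea with an arbitrary toy model, not the method developed in the thesis.

        import torch
        import torch.nn as nn

        model = nn.Sequential(
            nn.Linear(32, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 2)
        )

        def mc_dropout_predict(model, x, passes=30):
            model.train()                        # keep dropout stochastic
            with torch.no_grad():
                probs = torch.stack([torch.softmax(model(x), dim=-1)
                                     for _ in range(passes)])
            return probs.mean(0), probs.std(0)   # mean prediction and its spread

        x = torch.randn(4, 32)                   # four hypothetical feature vectors
        mean, spread = mc_dropout_predict(model, x)
        print(mean, spread)                      # high spread signals low confidence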

    Autoencoding sensory substitution

    Tens of millions of people live blind, and their number is ever increasing. Visual-to-auditory sensory substitution (SS) encompasses a family of cheap, generic solutions to assist the visually impaired by conveying visual information through sound. The required SS training is lengthy: months of effort are necessary to reach a practical level of adaptation. There are two reasons for the tedious training process: the elongated substituting audio signal, and the disregard for the compressive characteristics of the human hearing system. To overcome these obstacles, we developed a novel class of SS methods by training deep recurrent autoencoders for image-to-sound conversion. We successfully trained deep learning models on different datasets to execute visual-to-auditory stimulus conversion. By constraining the visual space, we demonstrated the viability of shortened substituting audio signals, while proposing mechanisms, such as the integration of computational hearing models, to optimally convey visual features in the substituting stimulus as perceptually discernible auditory components. We tested our approach in two separate cases. In the first experiment, the author went blindfolded for 5 days while performing SS training on hand posture discrimination. The second experiment assessed the accuracy of reaching movements towards objects on a table. In both test cases, above-chance-level accuracy was attained after a few hours of training. Our novel SS architecture broadens the horizon of rehabilitation methods engineered for the visually impaired. Further improvements on the proposed model should yield faster rehabilitation of the blind and, as a consequence, wider adoption of SS devices.
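    The image-to-sound autoencoding idea can be sketched as follows: an encoder turns an image into a short sequence of audio-like frames, and a recurrent decoder reconstructs the image from that sequence, so the sound is forced to carry the visual information. All module names and sizes below are assumptions for illustration, not the trained models from the thesis.

        import torch
        import torch.nn as nn

        class SSAutoencoder(nn.Module):
            """Toy visual-to-auditory autoencoder with a recurrent decoder."""
            def __init__(self, img_dim=28 * 28, audio_len=16, audio_dim=32):
                super().__init__()
                self.audio_len, self.audio_dim = audio_len, audio_dim
                self.encoder = nn.Linear(img_dim, audio_len * audio_dim)
                self.decoder_rnn = nn.GRU(audio_dim, 128, batch_first=True)
                self.decoder_out = nn.Linear(128, img_dim)

            def forward(self, img):
                b = img.size(0)
                audio = torch.tanh(self.encoder(img))          # substituting signal
                audio = audio.view(b, self.audio_len, self.audio_dim)
                _, h = self.decoder_rnn(audio)                 # "listen" to the signal
                return audio, self.decoder_out(h[-1])          # reconstructed image

        model = SSAutoencoder()
        img = torch.rand(4, 28 * 28)
        audio, recon = model(img)
        loss = nn.functional.mse_loss(recon, img)              # train end to end
        loss.backward()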

    Task-specific and interpretable feature learning

    Deep learning models have had tremendous impacts in recent years, yet a question has been raised by many: is deep learning just a triumph of empiricism? There has been emerging interest in reducing the gap between the theoretical soundness and interpretability of deep models and their empirical success. This dissertation provides a comprehensive discussion of bridging traditional model-based learning approaches, which emphasize problem-specific reasoning, and deep models, which allow for larger learning capacity. The overall goal is to devise next-generation feature learning architectures that are: 1) task-specific, namely, optimizing the entire pipeline from end to end while taking advantage of available prior knowledge and domain expertise; and 2) interpretable, namely, able to learn a representation consisting of semantically sensible variables and to display predictable behaviors. The dissertation starts by showing how classical sparse coding models can be improved in a task-specific way by formulating the entire pipeline as a bi-level optimization. It then illustrates how to incorporate the structure of classical learning models, e.g., sparse coding, into the design of deep architectures. A few concrete model examples are presented, ranging from the $\ell_0$ and $\ell_1$ sparse approximation models to the $\ell_\infty$-constrained model and the dual-sparsity model. The analytic tools from the optimization problems can be translated to guide the architecture design and performance analysis of deep models. As a result, these customized deep models demonstrate improved performance, intuitive interpretation, and efficient parameter initialization. On the other hand, deep networks are shown to be analogous to brain mechanisms: they exhibit the ability to describe semantic content from the primitive level to the abstract level. This dissertation therefore also presents a preliminary investigation of the synergy between feature learning and cognitive science and neuroscience. Two novel application domains, image aesthetics assessment and brain encoding, are explored, with promising preliminary results achieved.
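    The translation from sparse coding to a deep architecture can be illustrated with a generic LISTA-style sketch, where each layer is one learnable, soft-thresholded ISTA iteration; the dimensions and class names below are assumptions and the code is not the dissertation's exact models.

        import torch
        import torch.nn as nn

        class LISTA(nn.Module):
            """Unrolled ISTA for l1 sparse coding with learnable matrices."""
            def __init__(self, n_features, n_atoms, n_layers=3, theta=0.1):
                super().__init__()
                self.W = nn.Linear(n_features, n_atoms, bias=False)  # analysis step
                self.S = nn.Linear(n_atoms, n_atoms, bias=False)     # recurrence
                self.theta = nn.Parameter(torch.full((n_atoms,), theta))
                self.n_layers = n_layers

            def soft_threshold(self, x):
                return torch.sign(x) * torch.relu(x.abs() - self.theta)

            def forward(self, x):
                z = self.soft_threshold(self.W(x))
                for _ in range(self.n_layers - 1):
                    z = self.soft_threshold(self.W(x) + self.S(z))
                return z

        codes = LISTA(n_features=64, n_atoms=128)(torch.randn(10, 64))
        print(codes.shape)    # torch.Size([10, 128]); sparsity emerges with training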