Search CORE

6 research outputs found

Deep semi-supervised segmentation with weight-averaged consistency targets

Author: BT Polyak
C Baur
C Olivier
F Prados
G Litjens
O Ronneberger
Y Ganin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Recently proposed techniques for semi-supervised learning such as Temporal Ensembling and Mean Teacher have achieved state-of-the-art results in many important classification benchmarks. In this work, we expand the Mean Teacher approach to segmentation tasks and show that it can bring important improvements in a realistic small data regime using a publicly available multi-center dataset from the Magnetic Resonance Imaging (MRI) domain. We also devise a method to solve the problems that arise when using traditional data augmentation strategies for segmentation tasks on our new training scheme. Comment: 8 pages, 1 figure, accepted for DLMIA/MICCA
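The weight-averaged targets named in the title can be illustrated with a minimal sketch. It assumes a teacher whose weights are an exponential moving average of the student's, and a mean-squared-error consistency term between their per-pixel predictions. The function names, the toy weight dictionaries, the value of `alpha` and the choice of MSE are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def ema_update(teacher, student, alpha=0.99):
    """Move every teacher weight toward the matching student weight (EMA)."""
    return {
        name: alpha * teacher[name] + (1.0 - alpha) * student[name]
        for name in teacher
    }


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between the two models' per-pixel predictions."""
    return float(np.mean((student_probs - teacher_probs) ** 2))


# Toy parameters standing in for real network weights (hypothetical names).
student = {"conv1": np.ones((2, 2)), "bias": np.zeros(2)}
teacher = {"conv1": np.zeros((2, 2)), "bias": np.zeros(2)}

# One EMA step: the teacher moves 10% of the way toward the student.
teacher = ema_update(teacher, student, alpha=0.9)

# Consistency between two toy probability maps for the same unlabeled image.
loss = consistency_loss(np.array([[0.2, 0.8]]), np.array([[0.4, 0.6]]))
```

For segmentation, any spatial augmentation applied to the student's input would presumably have to be matched on the teacher's side before the two prediction maps are compared pixel by pixel. This sketch leaves that alignment step out.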

arXiv.org e-Print Archive

Crossref

PolyPublie

Lidar–camera semi-supervised learning for semantic segmentation

Author: Bellone Mauro
Caltagirone Luca
Sell Raivo
Svensson Lennart
Wahde Mattias
Publication venue: 'MDPI AG'
Publication date: 01/01/2021

In this work, we investigated two issues: (1) how the fusion of lidar and camera data can improve semantic segmentation performance compared with the individual sensor modalities in a supervised learning context; and (2) how fusion can also be leveraged for semi-supervised learning in order to further improve performance and to adapt to new domains without requiring any additional labelled data. A comparative study was carried out by providing an experimental evaluation on networks trained in different setups using various scenarios, from sunny days to rainy night scenes. The networks were tested on challenging and less common scenarios where cameras or lidars individually would not provide a reliable prediction. Our results suggest that semi-supervised learning and fusion techniques increase the overall performance of the network in challenging scenarios while using fewer data annotations.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central

Chalmers Research

Deep Learning Methods for MRI Spinal Cord Gray Matter Segmentation

Author: Samuel Perone Christian
Publication date: 01/03/2019

The human spinal cord, part of the Central Nervous System (CNS), is the main pathway connecting the brain and the peripheral nervous system. The gray matter present in the spinal cord is known to be associated with many neurological disorders such as multiple sclerosis and amyotrophic lateral sclerosis. Magnetic Resonance Imaging (MRI) is often used to study these diseases and to monitor disease burden and progression. To that end, morphometrics extracted from the spinal cord gray matter, such as gray matter volume, can be used to identify and understand tissue changes associated with these neurological disorders. To extract morphometrics from the spinal cord gray matter, a voxel-wise annotation is required for each slice of the MRI volume.

Manual annotation becomes prohibitive in practice because of the time-consuming effort required to annotate each slice of an MRI volume voxel-wise, not to mention the disagreement and bias introduced by different human annotators. Many semi-automatic or fully automatic methods exist, but most of them are composed of multi-stage approaches that can propagate errors through the pipeline, rely on data dictionaries, or do not generalize well when there are anatomical changes. It is well known that modern techniques based on representation learning and Deep Learning have achieved excellent results in a wide range of tasks, from computer vision to medical imaging.

The research agenda of this project is to advance the state-of-the-art results of previous methods by means of modern Deep Learning techniques, through the design, implementation, and evaluation of these methods for spinal cord gray matter segmentation. In this project, three main techniques were developed and open-sourced, as described below. The first technique is the design of a Deep Learning architecture to segment the spinal cord gray matter, which achieved state-of-the-art results when evaluated by a third-party system and compared to six other independently developed methods for gray matter segmentation. This technique also made it possible to segment an ex vivo volume with more than 4000 slices from fewer than 30 annotated samples of the same volume. The second technique was developed to leverage not only labeled data but also unlabeled data, by means of a semi-supervised learning method that was extended to segmentation tasks. This method achieved significant improvements in a realistic scenario under a small data regime by adding unlabeled data during the model training process. The third technique is an unsupervised domain adaptation method for segmentation. In this work, we addressed the problem of the distributional shift present in MRI data, which is mostly caused by different acquisition parametrization. We showed that by adapting the model to a target domain, presented to the model as unlabeled data, it is possible to achieve significant improvements in gray matter segmentation for the unseen target domain.

Following open science principles, we open-sourced all the methods on public repositories and implemented some of them in the Spinal Cord Toolbox (SCT), available at https://github.com/neuropoly/spinalcordtoolbox, a comprehensive and open-source library of analysis tools for MRI of the spinal cord. We also used only publicly available datasets for all evaluations and model training, and published all articles in open-access journals, with free availability on pre-print archive servers as well.

In this work, we were able to see that Deep Learning models can indeed provide huge steps forward when compared to previously developed methods. Deep Learning methods are very flexible and robust, allowing end-to-end learning of entire segmentation pipelines while being able to leverage unlabeled data, either to improve performance on the same domain in a semi-supervised learning scenario or to improve the performance of models on unseen target domains. It is also clear that Deep Learning is not a panacea for medical imaging. Many problems remain open, such as the generalization gap that is still present when using these models on unseen domains. A future line of research includes the ongoing development of techniques to inform machine learning models with MRI acquisition parametrization, in order to improve the generalization of the model to different contrasts and to the inherent variability of these images due to different machine vendors and anatomical changes, to name a few.

Another potential area of research is uncertainty estimation for knowledge distillation during the training phases of the approaches described in this work. However, uncertainty measures are still an open area of research in Deep Learning, with most methods providing a poor approximation or an underestimation of the epistemic uncertainty present in these models. Medical imaging is still a very challenging field for machine learning models because of the strong assumptions of distributional identity made by statistical learning algorithms, as well as the difficulty of incorporating new inductive biases into these models to exploit symmetry and rotation invariance, among others. Nevertheless, with the amount of available data growing, these models show great promise and are slowly gaining enough robustness to enter clinical practice.
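The abstract names gray matter volume as an example of a morphometric derived from a voxel-wise annotation. As a minimal sketch, such a volume can be computed as the number of foreground voxels times the physical volume of one voxel. The function name, the toy mask and the voxel size below are illustrative assumptions and are not taken from the thesis or from the Spinal Cord Toolbox.

```python
import numpy as np


def mask_volume_mm3(mask, voxel_size_mm):
    """Volume of a binary mask: foreground voxel count times one voxel's volume."""
    voxel_volume = float(np.prod(voxel_size_mm))
    return int(np.count_nonzero(mask)) * voxel_volume


# Toy binary segmentation (hypothetical): 4 x 4 in-plane voxels over 3 slices.
mask = np.zeros((4, 4, 3), dtype=bool)
mask[1:3, 1:3, :] = True  # 2 * 2 * 3 = 12 foreground voxels

# Assumed voxel size: 0.5 mm x 0.5 mm in-plane, 2.0 mm slice thickness.
volume = mask_volume_mm3(mask, voxel_size_mm=(0.5, 0.5, 2.0))
```

With real data, the voxel size would come from the image header rather than being hard-coded.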

PolyPublie

Recommended from our members

Musical source separation with deep learning and large-scale datasets

Author: Jansson A.

Throughout this thesis we will explore automatic music source separation by utilizing modern (at the time of writing) techniques and tools from machine learning and big data processing. The bulk of this work was carried out between 2016 and 2019. In Chapter 2 we conduct a review of source separation literature. We start by outlining a subset of applications of source separation in some depth. We describe some of the early, pioneering work in automatic source separation: Auditory Scene Analysis, and its digital counterpart, Computational Auditory Scene Analysis. We then introduce matrix decomposition-based methods such as Independent Component Analysis and Non-Negative Matrix Factorization, and pitch-informed methods where the separation algorithm is guided by pitch information that is known a priori. We briefly discuss user-guided methods, before conducting a thorough review of Deep Learning-based source separation, including recurrent, convolutional, deep clustering-based, and Generative Adversarial Networks. We then proceed to describe common evaluation metrics and training datasets. Finally, we list a number of challenges and drawbacks of current systems. Chapter 3 focuses on datasets for musical source separation. First we show the growth of dataset sizes for both machine learning in general and music information retrieval specifically. We give several examples of the complexities and idiosyncrasies that are intrinsic to music datasets. We then proceed to present a method for extracting ground truth data for source separation from large unstructured musical catalogs. In Chapter 4 we design a novel deep learning-based source separation algorithm. Motivation is provided by means of a musicological study that showed the high importance of vocals relative to other musical factors in the minds of listeners. At the core of the vocal separation algorithm is the U-Net, a deep learning architecture that uses skip connections to preserve fine-grained detail.
It was originally developed in the biomedical imaging domain, and later adapted to image-to-image translation. We adapt it to the source separation domain by treating spectrograms as images, and we use the dataset mining methods from Chapter 3 to generate sufficiently large training data. We evaluate our model objectively using standard evaluation metrics, and subjectively using "crowdsourced" human subjects. To the best of our knowledge, this is the first use of U-Nets for source separation. In the introduction above we proposed joint learning to optimize source separation and other objectives. In Chapter 5 we investigate one such instance: multi-task learning of vocal removal and vocal pitch tracking. We combine the vocal separation model from Chapter 4 with a state-of-the-art pitch salience estimation model, exploring several ways of combining the two models. We find that vocal pitch estimation benefits from joint learning when the two tasks are trained in sequence, with the source separation model preceding the pitch estimation model. We also report benefits from fine-tuning by iteratively applying the model. Chapter 6 extends the U-Net model to multiple instruments. In order to minimize the phase artifacts that were a common issue in Chapter 4, we modify the model to operate in the complex domain. We run experiments with several loss functions: time-domain loss, magnitude-only frequency-domain loss, and joint time and frequency-domain loss. Our experiments are evaluated both objectively and subjectively, and we carry out extensive qualitative analysis to investigate the effects of complex masking. Finally, we conclude the thesis in Chapter 7 by summarizing this work and highlighting several future directions of research.
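One common way to treat a spectrogram as an image for separation is to have a network predict a soft mask over the mixture's magnitude spectrogram and to reuse the mixture's phase for the separated source. The sketch below shows only that masking step. The random mask stands in for a network output, and both the function name and the magnitude-mask-with-mixture-phase setup are assumptions for illustration; the abstract does not spell out the exact formulation used in the thesis.

```python
import numpy as np


def apply_soft_mask(mixture_spec, mask):
    """Scale the mixture magnitude by a [0, 1] mask and keep the mixture phase."""
    magnitude = np.abs(mixture_spec)
    phase = np.angle(mixture_spec)
    return (mask * magnitude) * np.exp(1j * phase)


rng = np.random.default_rng(0)

# Toy complex spectrogram (frequency bins x time frames) standing in for an STFT.
mixture_spec = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))

# Stand-in for a network output: one value in [0, 1] per time-frequency bin.
mask = rng.uniform(0.0, 1.0, size=(5, 4))

vocals_spec = apply_soft_mask(mixture_spec, mask)
```

Because the estimated source inherits the mixture's phase, any mismatch between the true source phase and the mixture phase remains in the output. That is one plausible source of the phase artifacts the abstract mentions as motivation for operating in the complex domain.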

City Research Online