15 research outputs found

    Integrating Prior Knowledge in Contrastive Learning with Kernel

    EnD: Entangling and Disentangling deep representations for bias correction

    Artificial neural networks achieve state-of-the-art performance in an ever-growing number of tasks and are now used to solve a very wide variety of problems. However, issues such as biases in the training data call the generalization capability of these models into question. In this work we propose EnD, a regularization strategy whose aim is to prevent deep models from learning unwanted biases. In particular, we insert an "information bottleneck" at a certain point of the deep neural network, where we disentangle the information about the bias while still letting the information useful for the training task propagate forward through the rest of the model. One big advantage of EnD is that it adds no training complexity (such as decoders or extra layers in the model), since it is a regularizer applied directly to the trained model. Our experiments show that EnD effectively improves generalization on unbiased test sets, and that it can be applied to real-world scenarios, such as removing hidden biases in COVID-19 detection from radiographic images.

    Unveiling COVID-19 from Chest X-ray with deep learning: a hurdles race with small data

    The possibility of using widespread and simple chest X-ray (CXR) imaging for early screening of COVID-19 patients is attracting much interest from both the clinical and the AI communities. In this study we provide insights, and also raise warnings, on what is reasonable to expect when applying deep learning to COVID classification of CXR images. We provide a methodological guide and a critical reading of an extensive set of statistical results that can be obtained using currently available datasets. In particular, we take up the challenge posed by the currently small COVID datasets and show how significant the bias introduced by transfer learning from larger public non-COVID CXR datasets can be. We also contribute results on a medium-size COVID CXR dataset, recently collected by one of the major emergency hospitals in Northern Italy during the peak of the COVID pandemic. These novel data allow us to help validate the generalization capacity of preliminary results circulating in the scientific community. Our conclusions shed some light on the possibility of effectively discriminating COVID using CXR.

    Unbiased Supervised Contrastive Learning

    Many datasets are biased: they contain easy-to-learn features that are highly correlated with the target class only in the dataset, not in the true underlying distribution of the data. For this reason, learning unbiased models from biased data has become a very relevant research topic in recent years. In this work, we tackle the problem of learning representations that are robust to biases. We first present a margin-based theoretical framework that allows us to clarify why recent contrastive losses (InfoNCE, SupCon, etc.) can fail when dealing with biased data. Based on that, we derive a novel formulation of the supervised contrastive loss (epsilon-SupInfoNCE), providing more accurate control of the minimal distance between positive and negative samples. Furthermore, thanks to our theoretical framework, we also propose FairKL, a new debiasing regularization loss that works well even with extremely biased data. We validate the proposed losses on standard vision datasets including CIFAR10, CIFAR100, and ImageNet, and we assess the debiasing capability of FairKL with epsilon-SupInfoNCE, reaching state-of-the-art performance on a number of biased datasets, including real instances of biases in the wild. Comment: Accepted at ICLR 202

    Detection of subclinical atherosclerosis by image-based deep learning on chest x-ray

    Aims. To develop a deep-learning based system for recognition of subclinical atherosclerosis on a plain frontal chest x-ray. Methods and Results. A deep-learning algorithm to predict coronary artery calcium (CAC) score (the AI-CAC model) was developed on 460 chest x-rays (80% training cohort, 20% internal validation cohort) of primary prevention patients (58.4% male, median age 63 [51-74] years) with available paired chest x-ray and chest computed tomography (CT) indicated for any clinical reason and performed within 3 months. The CAC score calculated on chest CT was used as ground truth. The model was validated on a temporally independent cohort of 90 patients from the same institution (external validation). The diagnostic accuracy of the AI-CAC model, assessed by the area under the curve (AUC), was the primary outcome. Overall, the median AI-CAC score was 35 (0-388) and 28.9% of patients had no AI-CAC. The AUC of the AI-CAC model to identify a CAC>0 was 0.90 in the internal validation cohort and 0.77 in the external validation cohort. Sensitivity was consistently above 92% in both cohorts. In the overall cohort (n=540), among patients with AI-CAC=0, a single ASCVD event occurred, after 4.3 years. Patients with AI-CAC>0 had significantly higher Kaplan-Meier estimates for ASCVD events (13.5% vs. 3.4%, log-rank=0.013). Conclusion. The AI-CAC model seems to accurately detect subclinical atherosclerosis on chest x-ray with elevated sensitivity, and to predict ASCVD events with elevated negative predictive value. Adoption of the AI-CAC model to refine CV risk stratification or as an opportunistic screening tool requires prospective evaluation. Comment: Submitted to European Heart Journal - Cardiovascular Imaging. Additional material also added. 44 pages (30 main paper, 14 additional material), 14 figures (5 main manuscript, 9 additional material)

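    Collateral-free learning of deep representations: From natural images to biomedical applications (original title: Apprentissage sans collatéral des représentations profondes : Des images naturelles aux applications biomédicales)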

    Deep Learning (DL) has become one of the predominant tools for solving a variety of tasks, often with superior performance compared to previous state-of-the-art methods. DL models are often able to learn meaningful and abstract representations of the underlying data. However, it has been shown that they might also learn additional features, which are not necessarily relevant or required for the desired task. This can pose a number of issues, as this additional information can contain bias, noise, or sensitive information (e.g. gender, race, age) that should not be taken into account by the model. We refer to this information as collateral. The presence of collateral information translates into practical issues when deploying DL-based pipelines, especially if they involve private users' data. Learning robust representations that are free of collateral information can be highly relevant for a variety of fields and applications, such as medical applications and decision support systems. In this thesis, we introduce the concept of Collateral Learning, which refers to all those instances in which a model learns more information than intended. The aim of Collateral Learning is to bridge the gap between different fields in DL, such as robustness, debiasing, generalization in medical imaging, and privacy preservation. We propose different methods for achieving robust representations free of collateral information. Some of our contributions are based on regularization techniques, while others take the form of novel loss functions. In the first part of the thesis, we lay the foundations of our work by developing techniques for robust representation learning on natural images. We focus on one of the most important instances of Collateral Learning, namely biased data. Specifically, we focus on Contrastive Learning (CL), and we propose a unified metric learning framework that allows us both to easily analyze existing loss functions and to derive novel ones. Here, we propose a novel supervised contrastive loss function, ε-SupInfoNCE, and two debiasing regularization techniques, EnD and FairKL, that achieve state-of-the-art performance on a number of standard vision classification and debiasing benchmarks. In the second part of the thesis, we focus on Collateral Learning in medical imaging, specifically on neuroimaging and chest X-ray images. For neuroimaging, we present a novel contrastive learning approach for brain age estimation. Our approach achieves state-of-the-art results on the OpenBHB dataset for age regression and shows increased robustness to the site effect. We also leverage this method to detect unhealthy brain aging patterns, showing promising results in the classification of brain conditions such as Mild Cognitive Impairment (MCI) and Alzheimer's Disease (AD). For chest X-ray (CXR) images, we target COVID-19 classification, showing how Collateral Learning can hinder the reliability of such models. To tackle this issue, we propose a transfer learning approach that, combined with our regularization techniques, shows promising results on an original multi-site CXR dataset. Finally, we provide some hints about Collateral Learning and privacy preservation in DL models. We show that some of our proposed methods can be effective in preventing certain information from being learned by the model, thus avoiding potential data leakage.