39 research outputs found

    Exploring Generalisability of Self-Distillation with No Labels for SAR-Based Vegetation Prediction

    In this work we pre-train a DINO-ViT based model using two Synthetic Aperture Radar datasets (S1GRD or GSSIC) across three regions (China, Conus, Europe). We fine-tune the models on smaller labeled datasets to predict vegetation percentage, and empirically study the connection between the embedding space of the models and their ability to generalize across diverse geographic regions and to unseen data. For S1GRD, the embedding spaces of different regions are clearly separated, while those of GSSIC overlap. Positional patterns remain during fine-tuning, and greater distances in embeddings often result in higher errors for unfamiliar regions. With this, our work increases our understanding of generalizability for self-supervised models applied to remote sensing. Comment: 10 pages, 9 figures

    Fewshot learning on global multimodal embeddings for earth observation tasks

    In this work we pretrain a CLIP/ViT based model using three different modalities of satellite imagery across five AOIs covering over ~10% of Earth's total landmass, namely Sentinel 2 RGB optical imagery, Sentinel 1 SAR radar amplitude and interferometric coherence. This model uses ~250M parameters. Then, we use the embeddings produced for each modality with a classical machine learning method to attempt different downstream tasks for earth observation related to vegetation, built up surface, croplands and permanent water. We consistently show how we reduce the need for labeled data by 99%, so that with ~200-500 randomly selected labeled examples (around 4K-10K km²) we reach performance levels analogous to those achieved with the full labeled datasets (about 150K image chips or 3M km² in each area of interest, AOI) on all modalities, AOIs and downstream tasks. This leads us to think that the model has captured significant earth features useful in a wide variety of scenarios. To enhance our model's usability in practice, its architecture allows inference in contexts with missing modalities and even missing channels within each modality. Additionally, we visually show that this embedding space, obtained with no labels, is sensitive to the different earth features represented by the labelled datasets we selected. Comment: 9 pages, 6 figures, presented at the NeurIPS workshop on Robustness of Few-shot and Zero-shot Learning in Foundation Models

    Large Scale Masked Autoencoding for Reducing Label Requirements on SAR Data

    Satellite-based remote sensing is instrumental in the monitoring and mitigation of the effects of anthropogenic climate change. Large scale, high resolution data derived from these sensors can be used to inform intervention and policy decision making, but the timeliness and accuracy of these interventions are limited by the use of optical data, which cannot operate at night and is affected by adverse weather conditions. Synthetic Aperture Radar (SAR) offers a robust alternative to optical data, but its associated complexities limit the scope of labelled data generation for traditional deep learning. In this work, we apply a self-supervised pretraining scheme, masked autoencoding, to SAR amplitude data covering 8.7% of the Earth's land surface area, and tune the pretrained weights on two downstream tasks crucial to monitoring climate change: vegetation cover prediction and land cover classification. We show that the use of this pretraining scheme reduces labelling requirements for the downstream tasks by more than an order of magnitude, and that this pretraining generalises geographically, with the performance gain increasing when tuned downstream on regions outside the pretraining set. Our findings significantly advance climate change mitigation by facilitating the development of task and region-specific SAR models, allowing local communities and organizations to deploy tailored solutions for rapid, accurate monitoring of climate change effects. Comment: 12 pages, 6 figures

    Exploring DINO: Emergent Properties and Limitations for Synthetic Aperture Radar Imagery

    Self-supervised learning (SSL) models have recently demonstrated remarkable performance across various tasks, including image segmentation. This study delves into the emergent characteristics of the Self-Distillation with No Labels (DINO) algorithm and its application to Synthetic Aperture Radar (SAR) imagery. We pre-train a vision transformer (ViT)-based DINO model using unlabeled SAR data, and later fine-tune the model to predict high-resolution land cover maps. We rigorously evaluate the utility of attention maps generated by the ViT backbone and compare them with the model's token embedding space. We observe a small improvement in model performance with pre-training compared to training from scratch and discuss the limitations and opportunities of SSL for remote sensing and land cover segmentation. Beyond small performance increases, we show that ViT attention maps hold great intrinsic value for remote sensing, and could provide useful inputs to other algorithms. With this, our work lays the groundwork for bigger and better SSL models for Earth Observation. Comment: 9 pages, 5 figures

    Brain volumes quantification from MRI in healthy controls: Assessing correlation, agreement and robustness of a convolutional neural network-based software against FreeSurfer, CAT12 and FSL

    Background and purpose: There are instances in which an estimate of the brain volume should be obtained from MRI in clinical practice. Our objective is to calculate the cross-sectional robustness of a convolutional neural network (CNN) based software (Entelai Pic) for brain volume estimation and compare it to traditional software such as FreeSurfer, CAT12 and FSL in healthy controls (HC). Materials and Methods: Sixteen HC were scanned four times, on two different days and on two different MRI scanners (1.5 T and 3 T). Volumetric T1-weighted images were acquired and post-processed with FreeSurfer v6.0.0, Entelai Pic v2, CAT12 v12.5 and FSL v5.0.9. Whole-brain, grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF) volumes were calculated. Correlation and agreement between methods were assessed using the intraclass correlation coefficient (ICC) and Bland-Altman plots. Robustness was assessed using the coefficient of variation (CV). Results: Whole-brain volume estimation had better correlation between FreeSurfer and Entelai Pic (ICC (95% CI) 0.96 (0.94−0.97)) than between FreeSurfer and CAT12 (0.92 (0.88−0.96)) or FSL (0.87 (0.79−0.91)). WM, GM and CSF showed a similar trend. Compared to FreeSurfer, Entelai Pic provided similarly robust segmentations of brain volumes both on same-scanner (mean CV 1.07, range 0.20–3.13% vs. mean CV 1.05, range 0.21–3.20%, p = 0.86) and on different-scanner variables (mean CV 3.84, range 2.49–5.91% vs. mean CV 3.84, range 2.62–5.13%, p = 0.96). Mean post-processing times were 480, 5, 40 and 5 min for FreeSurfer, Entelai Pic, CAT12 and FSL respectively. Conclusion: Based on robustness and processing times, our CNN-based model is suitable for cross-sectional volumetry in clinical practice.
    Author affiliations:
    Chaves, Hernan. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina
    Dorr, Francisco. Entelai; Argentina
    Costa, Martín Elías. Entelai; Argentina
    Serra, María Mercedes. Entelai; Argentina. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina
    Fernandez Slezak, Diego. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina. Entelai; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
    Farez, Mauricio Franco. Entelai; Argentina. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
    Sevlever, Gustavo. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina. Entelai; Argentina
    Yañez, Paulina Celia. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina. Universidad de Buenos Aires; Argentina
    Cejas, Claudia. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina

    COVID-19 pneumonia accurately detected on chest radiographs with artificial intelligence

    Purpose: To investigate the diagnostic performance of an Artificial Intelligence (AI) system for detection of COVID-19 in chest radiographs (CXR), and compare results to those of physicians working alone, or with AI support. Materials and methods: An AI system was fine-tuned to discriminate confirmed COVID-19 pneumonia from other viral and bacterial pneumonia and non-pneumonia patients, and used to review 302 CXR images from adult patients retrospectively sourced from nine different databases. Fifty-four physicians blind to diagnosis were invited to interpret images under identical conditions in a test set, and randomly assigned either to receive or not receive support from the AI system. Comparisons were then made between diagnostic performance of physicians working with and without AI support. AI system performance was evaluated using the area under the receiver operating characteristic (AUROC), and sensitivity and specificity of physician performance compared to that of the AI system. Results: Discrimination by the AI system of COVID-19 pneumonia showed an AUROC of 0.96 in the validation set and 0.83 in the external test set. The AI system outperformed physicians in the AUROC overall (70% increase in sensitivity and 1% increase in specificity, p < 0.0001). When working with AI support, physicians increased their diagnostic sensitivity from 47% to 61% (p < 0.001), although specificity decreased from 79% to 75% (p = 0.007). Conclusions: Our results suggest that interpreting chest radiographs (CXR) supported by AI increases physician diagnostic sensitivity for COVID-19 detection. This approach involving a human-machine partnership may help expedite triaging efforts and improve resource allocation in the current crisis.
    Author affiliations:
    Dorr, Francisco. Entelai; Argentina
    Chaves, Hernán. Entelai; Argentina. Fundación P/la Lucha C/enferm. neurológicas Infancia. Instituto de Neurociencias. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Neurociencias; Argentina
    Serra, María Mercedes. Entelai; Argentina. Fundación P/la Lucha C/enferm. neurológicas Infancia. Instituto de Neurociencias. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Neurociencias; Argentina
    Ramirez, Andres. Entelai; Argentina
    Costa, Martín Elías. Entelai; Argentina
    Seia, Joaquín Oscar. Entelai; Argentina
    Cejas, Claudia. Fundación P/la Lucha C/enferm. neurológicas Infancia. Instituto de Neurociencias. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Neurociencias; Argentina
    Castro, Marcelo. Departamento de Diagnóstico por Imágenes, Clínica Indisa; Chile
    Eyheremendy, Eduardo. Hospital Alemán; Argentina
    Fernández Slezak, Diego. Entelai; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
    Farez, Mauricio Franco. Entelai; Argentina. Fundación P/la Lucha C/enferm. neurológicas Infancia. Instituto de Neurociencias. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Neurociencias; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina

    Search for large extra dimensions in the production of jets and missing transverse energy in ppbar collisions at sqrt(s)=1.96 TeV

    We present the results of a search for new physics in the jets plus missing transverse energy data sample collected from 368/pb of ppbar collisions at sqrt(s) = 1.96 TeV recorded by the Collider Detector at Fermilab. We compare the number of events observed in the data with a data-based estimate of the standard model backgrounds contributing to this signature. We observe no significant excess of events, and we interpret this null result in terms of lower limits on the fundamental Planck scale for a large extra dimensions scenario.

    Measurement of the W+W- Production Cross Section in ppbar Collisions at sqrt(s)=1.96 TeV using Dilepton Events

    We present a measurement of the W+W- production cross section using 184/pb of ppbar collisions at a center-of-mass energy of 1.96 TeV collected with the Collider Detector at Fermilab. Using the dilepton decay channel W+W- -> l+ l- nu nubar, where the charged leptons can be either electrons or muons, we find 17 candidate events compared to an expected background of 5.0 +2.2 -0.8 events. The resulting W+W- production cross section measurement of sigma(ppbar -> W+W-) = 14.6 +5.8 -5.1 (stat) +1.8 -3.0 (syst) ±0.9 (lum) pb agrees well with the Standard Model expectation. Comment: 8 pages, 2 figures, 2 tables. To be submitted to Physical Review Letters

    Measurement of the W+W- production cross section in ppbar collisions at sqrt(s)=1.96 TeV using dilepton events

    We present a measurement of the W+W- production cross section using 184/pb of ppbar collisions at a center-of-mass energy of 1.96 TeV collected with the Collider Detector at Fermilab. Using the dilepton decay channel W+W- -> l+ nu l- nubar, where the charged leptons can be either electrons or muons, we find 17 candidate events compared to an expected background of 5.0 +2.2 -0.8 events. The resulting W+W- production cross-section measurement of sigma(ppbar -> W+W-) = 14.6 +5.8 -5.1 (stat) +1.8 -3.0 (syst) ±0.9 (lum) pb agrees well with the standard model expectation.

    Stroke genetics informs drug discovery and risk prediction across ancestries

    Previous genome-wide association studies (GWASs) of stroke, the second leading cause of death worldwide, were conducted predominantly in populations of European ancestry. Here, in cross-ancestry GWAS meta-analyses of 110,182 patients who have had a stroke (five ancestries, 33% non-European) and 1,503,898 control individuals, we identify association signals for stroke and its subtypes at 89 (61 new) independent loci: 60 in primary inverse-variance-weighted analyses and 29 in secondary meta-regression and multitrait analyses. On the basis of internal cross-ancestry validation and an independent follow-up in 89,084 additional cases of stroke (30% non-European) and 1,013,843 control individuals, 87% of the primary stroke risk loci and 60% of the secondary stroke risk loci were replicated (P < 0.05). Effect sizes were highly correlated across ancestries. Cross-ancestry fine-mapping, in silico mutagenesis analysis, and transcriptome-wide and proteome-wide association analyses revealed putative causal genes (such as SH3PXD2A and FURIN) and variants (such as at GRK5 and NOS3). Using a three-pronged approach, we provide genetic evidence for putative drug effects, highlighting F11, KLKB1, PROC, GP1BA, LAMC2 and VCAM1 as possible targets, with drugs already under investigation for stroke for F11 and PROC. A polygenic score integrating cross-ancestry and ancestry-specific stroke GWASs with vascular-risk factor GWASs (integrative polygenic scores) strongly predicted ischaemic stroke in populations of European, East Asian and African ancestry. Stroke genetic risk scores were predictive of ischaemic stroke independent of clinical risk factors in 52,600 clinical-trial participants with cardiometabolic disease. Our results provide insights to inform biology, reveal potential drug targets and derive genetic risk prediction tools across ancestries.