Exploring Generalisability of Self-Distillation with No Labels for SAR-Based Vegetation Prediction
In this work we pre-train a DINO-ViT based model using two Synthetic Aperture
Radar datasets (S1GRD or GSSIC) across three regions (China, Conus, Europe). We
fine-tune the models on smaller labeled datasets to predict vegetation
percentage, and empirically study the connection between the embedding space of
the models and their ability to generalize across diverse geographic regions
and to unseen data. For S1GRD, the embedding spaces of different regions are
clearly separated, while those for GSSIC overlap. Positional patterns remain during
fine-tuning, and greater distances in embeddings often result in higher errors
for unfamiliar regions. With this, our work increases our understanding of
generalizability for self-supervised models applied to remote sensing.
Comment: 10 pages, 9 figures
Fewshot learning on global multimodal embeddings for earth observation tasks
In this work we pretrain a CLIP/ViT based model using three different
modalities of satellite imagery across five AOIs covering over ~10% of Earth's
total landmass, namely Sentinel 2 RGB optical imagery, Sentinel 1 SAR radar
amplitude and interferometric coherence. This model uses M
parameters. Then, we use the embeddings produced for each modality with a
classical machine learning method to attempt different downstream tasks for
earth observation related to vegetation, built up surface, croplands and
permanent water. We consistently show that we reduce the need for labeled data
by 99%, so that with ~200-500 randomly selected labeled examples (around
4K-10K km²) we reach performance levels analogous to those achieved with the
full labeled datasets (about 150K image chips or 3M km² in each area of
interest, AOI) on all modalities, AOIs and downstream tasks. This leads us to
think that the model has captured significant earth features useful in a wide
variety of scenarios. To enhance our model's usability in practice, its
architecture allows inference in contexts with missing modalities and even
missing channels within each modality. Additionally, we visually show that this
embedding space, obtained with no labels, is sensitive to the different earth
features represented by the labelled datasets we selected.
Comment: 9 pages, 6 figures, presented at the NeurIPS workshop on Robustness of
Few-shot and Zero-shot Learning in Foundation Models
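The few-shot setup in the abstract above (a classical method fitted on frozen embeddings using a handful of labels) can be illustrated with a minimal sketch. The nearest-centroid classifier, the toy 2-D embeddings, the class names and the function names are all assumptions made for illustration; the abstract does not say which classical method was used.

```python
import math
from collections import defaultdict


def fit_centroids(embeddings, labels):
    """Average the embedding vectors of each class (hypothetical helper)."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(embeddings, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], vec))


# Toy stand-ins for frozen embeddings; real ones would come from the pretrained model.
few_shot_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
few_shot_y = ["water", "water", "cropland", "cropland"]

centroids = fit_centroids(few_shot_x, few_shot_y)
print(predict(centroids, [0.1, 0.2]))
print(predict(centroids, [0.8, 0.8]))
```

Because the encoder stays frozen, only the per-class averages are estimated from labels, which is one reason such a setup can work with very few labeled examples.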
Large Scale Masked Autoencoding for Reducing Label Requirements on SAR Data
Satellite-based remote sensing is instrumental in the monitoring and
mitigation of the effects of anthropogenic climate change. Large scale, high
resolution data derived from these sensors can be used to inform intervention
and policy decision making, but the timeliness and accuracy of these
interventions is limited by use of optical data, which cannot operate at night
and is affected by adverse weather conditions. Synthetic Aperture Radar (SAR)
offers a robust alternative to optical data, but its associated complexities
limit the scope of labelled data generation for traditional deep learning. In
this work, we apply a self-supervised pretraining scheme, masked autoencoding,
to SAR amplitude data covering 8.7% of the Earth's land surface area, and tune
the pretrained weights on two downstream tasks crucial to monitoring climate
change - vegetation cover prediction and land cover classification. We show
that the use of this pretraining scheme reduces labelling requirements for the
downstream tasks by more than an order of magnitude, and that this pretraining
generalises geographically, with the performance gain increasing when tuned
downstream on regions outside the pretraining set. Our findings significantly
advance climate change mitigation by facilitating the development of task and
region-specific SAR models, allowing local communities and organizations to
deploy tailored solutions for rapid, accurate monitoring of climate change
effects.
Comment: 12 pages, 6 figures
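The masking step at the heart of masked autoencoding, as named in the abstract above, can be sketched as follows. The patch count (196, i.e. a 14 x 14 grid), the 75% mask ratio and the function name are assumptions for illustration; the abstract does not give these values.

```python
import random


def split_patches(num_patches, mask_ratio, seed=0):
    """Randomly split patch indices into visible and masked sets.

    In a masked autoencoder the encoder sees only the visible patches,
    and the decoder is trained to reconstruct the masked ones.
    """
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# Assumed values: 196 patches per image chip, 75% of them hidden.
visible, masked = split_patches(num_patches=196, mask_ratio=0.75)
print(len(visible), len(masked))
```

No labels are involved at this stage: the reconstruction target is the input itself, which is what allows pretraining on large unlabelled SAR archives.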
Exploring DINO: Emergent Properties and Limitations for Synthetic Aperture Radar Imagery
Self-supervised learning (SSL) models have recently demonstrated remarkable
performance across various tasks, including image segmentation. This study
delves into the emergent characteristics of the Self-Distillation with No
Labels (DINO) algorithm and its application to Synthetic Aperture Radar (SAR)
imagery. We pre-train a vision transformer (ViT)-based DINO model using
unlabeled SAR data, and later fine-tune the model to predict high-resolution
land cover maps. We rigorously evaluate the utility of attention maps generated
by the ViT backbone and compare them with the model's token embedding space. We
observe a small improvement in model performance with pre-training compared to
training from scratch and discuss the limitations and opportunities of SSL for
remote sensing and land cover segmentation. Beyond small performance increases,
we show that ViT attention maps hold great intrinsic value for remote sensing,
and could provide useful inputs to other algorithms. With this, our work lays
the groundwork for bigger and better SSL models for Earth Observation.
Comment: 9 pages, 5 figures
Brain volumes quantification from MRI in healthy controls: Assessing correlation, agreement and robustness of a convolutional neural network-based software against FreeSurfer, CAT12 and FSL
Background and purpose: There are instances in which an estimate of the brain volume should be obtained from MRI in clinical practice. Our objective is to calculate cross-sectional robustness of a convolutional neural network (CNN) based software (Entelai Pic) for brain volume estimation and compare it to traditional software such as FreeSurfer, CAT12 and FSL in healthy controls (HC). Materials and Methods: Sixteen HC were scanned four times, two different days on two different MRI scanners (1.5 T and 3 T). Volumetric T1-weighted images were acquired and post-processed with FreeSurfer v6.0.0, Entelai Pic v2, CAT12 v12.5 and FSL v5.0.9. Whole-brain, grey matter (GM), white matter (WM) and cerebrospinal fluid (CSF) volumes were calculated. Correlation and agreement between methods were assessed using the intraclass correlation coefficient (ICC) and Bland-Altman plots. Robustness was assessed using the coefficient of variation (CV). Results: Whole-brain volume estimation had better correlation between FreeSurfer and Entelai Pic (ICC (95% CI) 0.96 (0.94−0.97)) than FreeSurfer and CAT12 (0.92 (0.88−0.96)) and FSL (0.87 (0.79−0.91)). WM, GM and CSF showed a similar trend. Compared to FreeSurfer, Entelai Pic provided similarly robust segmentations of brain volumes both on same-scanner (mean CV 1.07, range 0.20–3.13% vs. mean CV 1.05, range 0.21–3.20%, p = 0.86) and on different-scanner variables (mean CV 3.84, range 2.49–5.91% vs. mean CV 3.84, range 2.62–5.13%, p = 0.96). Mean post-processing times were 480, 5, 40 and 5 min for FreeSurfer, Entelai Pic, CAT12 and FSL respectively. Conclusion: Based on robustness and processing times, our CNN-based model is suitable for cross-sectional volumetry in clinical practice.
Affiliation: Chaves, Hernan. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina
Affiliation: Dorr, Francisco. Entelai; Argentina
Affiliation: Costa, Martín Elías. Entelai; Argentina
Affiliation: Serra, María Mercedes. Entelai; Argentina. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina
Affiliation: Fernandez Slezak, Diego. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina. Entelai; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
Affiliation: Farez, Mauricio Franco. Entelai; Argentina. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Sevlever, Gustavo. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina. Entelai; Argentina
Affiliation: Yañez, Paulina Celia. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina. Universidad de Buenos Aires; Argentina
Affiliation: Cejas, Claudia. Fundación para la Lucha contra las Enfermedades Neurológicas de la Infancia; Argentina
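The robustness metric named in the abstract above, the coefficient of variation (CV), can be computed as in the following minimal sketch. The use of the sample standard deviation (n - 1 denominator), the function name and the repeat-scan volumes are assumptions for illustration; the values are made up and are not data from the study.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    return 100.0 * statistics.stdev(values) / mean


# Made-up whole-brain volumes (mL) for one subject scanned four times.
repeat_volumes_ml = [1200.0, 1210.0, 1195.0, 1205.0]

cv = coefficient_of_variation(repeat_volumes_ml)
print(f"CV = {cv:.2f}%")
```

A lower CV across repeat scans of the same subject indicates a more repeatable volume estimate, which is how the abstract compares tools on the same scanner and across scanners.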
COVID-19 pneumonia accurately detected on chest radiographs with artificial intelligence
Purpose: To investigate the diagnostic performance of an Artificial Intelligence (AI) system for detection of COVID-19 in chest radiographs (CXR), and compare results to those of physicians working alone, or with AI support. Materials and methods: An AI system was fine-tuned to discriminate confirmed COVID-19 pneumonia from other viral and bacterial pneumonia and non-pneumonia patients, and used to review 302 CXR images from adult patients retrospectively sourced from nine different databases. Fifty-four physicians blind to diagnosis were invited to interpret images under identical conditions in a test set, and randomly assigned either to receive or not receive support from the AI system. Comparisons were then made between diagnostic performance of physicians working with and without AI support. AI system performance was evaluated using the area under the receiver operating characteristic curve (AUROC), and the sensitivity and specificity of physicians were compared to those of the AI system. Results: Discrimination by the AI system of COVID-19 pneumonia showed an AUROC of 0.96 in the validation set and 0.83 in the external test set. The AI system outperformed physicians in the AUROC overall (70% increase in sensitivity and 1% increase in specificity, p < 0.0001). When working with AI support, physicians increased their diagnostic sensitivity from 47% to 61% (p < 0.001), although specificity decreased from 79% to 75% (p = 0.007). Conclusions: Our results suggest that interpreting chest radiographs (CXR) with AI support increases physician diagnostic sensitivity for COVID-19 detection. This approach involving a human-machine partnership may help expedite triaging efforts and improve resource allocation in the current crisis.
Affiliation: Dorr, Francisco. Entelai; Argentina
Affiliation: Chaves, Hernán. Entelai; Argentina. Fundación P/la Lucha C/enferm. neurológicas Infancia. Instituto de Neurociencias. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Neurociencias; Argentina
Affiliation: Serra, María Mercedes. Entelai; Argentina. Fundación P/la Lucha C/enferm. neurológicas Infancia. Instituto de Neurociencias. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Neurociencias; Argentina
Affiliation: Ramirez, Andres. Entelai; Argentina
Affiliation: Costa, Martín Elías. Entelai; Argentina
Affiliation: Seia, Joaquín Oscar. Entelai; Argentina
Affiliation: Cejas, Claudia. Fundación P/la Lucha C/enferm. neurológicas Infancia. Instituto de Neurociencias. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Neurociencias; Argentina
Affiliation: Castro, Marcelo. Departamento de Diagnóstico por Imágenes, Clínica Indisa; Chile
Affiliation: Eyheremendy, Eduardo. Hospital Alemán; Argentina
Affiliation: Fernández Slezak, Diego. Entelai; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Investigación en Ciencias de la Computación. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Investigación en Ciencias de la Computación; Argentina
Affiliation: Farez, Mauricio Franco. Entelai; Argentina. Fundación P/la Lucha C/enferm. neurológicas Infancia. Instituto de Neurociencias. - Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Neurociencias; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
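The sensitivity and specificity figures quoted in the abstract above follow the standard definitions, sketched below. The function names and the confusion-matrix counts are assumptions for illustration; the counts are made up and are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that were correctly identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of actual negatives that were correctly identified."""
    return true_neg / (true_neg + false_pos)


# Made-up reading results: 50 COVID-19 cases and 100 non-COVID cases.
sens = sensitivity(true_pos=45, false_neg=5)
spec = specificity(true_neg=80, false_pos=20)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

The two metrics trade off against each other: a reader who calls more images positive gains sensitivity and loses specificity, which matches the direction of the change the abstract reports for physicians with AI support.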
Search for large extra dimensions in the production of jets and missing transverse energy in ppbar collisions at sqrt(s)=1.96 TeV
We present the results of a search for new physics in the jets plus missing transverse energy data sample collected from 368/pb of ppbar collisions at sqrt(s) = 1.96 TeV recorded by the Collider Detector at Fermilab. We compare the number of events observed in the data with a data-based estimate of the standard model backgrounds contributing to this signature. We observe no significant excess of events, and we interpret this null result in terms of lower limits on the fundamental Planck scale for a large extra dimensions scenario.
Measurement of the W+W- Production Cross Section in ppbar Collisions at sqrt(s)=1.96 TeV using Dilepton Events
We present a measurement of the W+W- production cross section using 184/pb of
ppbar collisions at a center-of-mass energy of 1.96 TeV collected with the
Collider Detector at Fermilab. Using the dilepton decay channel W+W- ->
l+l-vvbar, where the charged leptons can be either electrons or muons, we find
17 candidate events compared to an expected background of 5.0+2.2-0.8 events.
The resulting W+W- production cross section measurement of sigma(ppbar -> W+W-)
= 14.6 +5.8 -5.1 (stat) +1.8 -3.0 (syst) +-0.9 (lum) pb agrees well with the
Standard Model expectation.
Comment: 8 pages, 2 figures, 2 tables. To be submitted to Physical Review
Letters
Measurement of the W+W- production cross section in ppbar collisions at sqrt(s)=1.96 TeV using dilepton events
We present a measurement of the W+W- production cross section using 184/pb of ppbar collisions at a center-of-mass energy of 1.96 TeV collected with the Collider Detector at Fermilab. Using the dilepton decay channel W+W- -> l+ ν l- νbar, where the charged leptons can be either electrons or muons, we find 17 candidate events compared to an expected background of 5.0 +2.2 -0.8 events. The resulting W+W- production cross-section measurement of σ(ppbar -> W+W-) = 14.6 +5.8 -5.1 (stat) +1.8 -3.0 (syst) ± 0.9 (lum) pb agrees well with the standard model expectation.
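The numbers in the abstract above can be related through the usual counting-experiment expression for a cross section. This is a sketch under an assumption: the abstract does not state the formula used, and the symbol $\epsilon_{\mathrm{tot}}$ (the combined acceptance, selection efficiency and dilepton branching fraction) is introduced here only for illustration. Its value below is back-calculated from the quoted central values and is not a number reported in the abstract.

```latex
\sigma(p\bar{p} \to W^+W^-)
  = \frac{N_{\mathrm{obs}} - N_{\mathrm{bkg}}}{\epsilon_{\mathrm{tot}} \, \mathcal{L}},
\qquad
N_{\mathrm{obs}} = 17, \quad
N_{\mathrm{bkg}} = 5.0, \quad
\mathcal{L} = 184~\mathrm{pb}^{-1}

% Implied by the quoted central value of 14.6 pb (an inference, not a reported number):
\epsilon_{\mathrm{tot}}
  \approx \frac{17 - 5.0}{14.6~\mathrm{pb} \times 184~\mathrm{pb}^{-1}}
  \approx 4.5 \times 10^{-3}
```

With only 12 signal candidates after background subtraction, the statistical uncertainty dominates, consistent with the large (stat) term quoted in the result.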
Stroke genetics informs drug discovery and risk prediction across ancestries
Previous genome-wide association studies (GWASs) of stroke, the second leading cause of death worldwide, were conducted predominantly in populations of European ancestry [1,2]. Here, in cross-ancestry GWAS meta-analyses of 110,182 patients who have had a stroke (five ancestries, 33% non-European) and 1,503,898 control individuals, we identify association signals for stroke and its subtypes at 89 (61 new) independent loci: 60 in primary inverse-variance-weighted analyses and 29 in secondary meta-regression and multitrait analyses. On the basis of internal cross-ancestry validation and an independent follow-up in 89,084 additional cases of stroke (30% non-European) and 1,013,843 control individuals, 87% of the primary stroke risk loci and 60% of the secondary stroke risk loci were replicated (P < 0.05). Effect sizes were highly correlated across ancestries. Cross-ancestry fine-mapping, in silico mutagenesis analysis [3], and transcriptome-wide and proteome-wide association analyses revealed putative causal genes (such as SH3PXD2A and FURIN) and variants (such as at GRK5 and NOS3). Using a three-pronged approach [4], we provide genetic evidence for putative drug effects, highlighting F11, KLKB1, PROC, GP1BA, LAMC2 and VCAM1 as possible targets, with drugs already under investigation for stroke for F11 and PROC. A polygenic score integrating cross-ancestry and ancestry-specific stroke GWASs with vascular-risk factor GWASs (integrative polygenic scores) strongly predicted ischaemic stroke in populations of European, East Asian and African ancestry [5]. Stroke genetic risk scores were predictive of ischaemic stroke independent of clinical risk factors in 52,600 clinical-trial participants with cardiometabolic disease. Our results provide insights to inform biology, reveal potential drug targets and derive genetic risk prediction tools across ancestries.
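The inverse-variance-weighted analysis mentioned in the abstract above can be illustrated with the textbook fixed-effect estimator, in which each study's effect estimate is weighted by the reciprocal of its squared standard error. This is a minimal sketch: the function name and the two per-study effect sizes are made up for illustration, and the abstract does not describe the study's actual meta-analysis pipeline.

```python
import math


def ivw_meta(betas, standard_errors):
    """Fixed-effect inverse-variance-weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Made-up log-odds-ratio estimates for one variant in two cohorts.
beta, se = ivw_meta(betas=[0.10, 0.20], standard_errors=[0.05, 0.10])
print(f"pooled beta = {beta:.3f}, pooled SE = {se:.4f}")
```

The more precise cohort (smaller standard error) pulls the pooled estimate towards its own value, and the pooled standard error is smaller than that of either cohort alone.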