42 research outputs found

    Exploring Generalisability of Self-Distillation with No Labels for SAR-Based Vegetation Prediction

    In this work we pre-train a DINO-ViT based model using two Synthetic Aperture Radar datasets (S1GRD or GSSIC) across three regions (China, Conus, Europe). We fine-tune the models on smaller labeled datasets to predict vegetation percentage, and empirically study the connection between the embedding space of the models and their ability to generalize across diverse geographic regions and to unseen data. For S1GRD, the embedding spaces of different regions are clearly separated, while those of GSSIC overlap. Positional patterns remain during fine-tuning, and greater distances in embeddings often result in higher errors for unfamiliar regions. With this, our work increases our understanding of generalizability for self-supervised models applied to remote sensing. Comment: 10 pages, 9 figures

    Fewshot learning on global multimodal embeddings for earth observation tasks

    In this work we pretrain a CLIP/ViT based model using three different modalities of satellite imagery across five AOIs covering over ~10% of Earth's total landmass, namely Sentinel 2 RGB optical imagery, Sentinel 1 SAR radar amplitude and interferometric coherence. This model uses ~250 M parameters. Then, we use the embeddings produced for each modality with a classical machine learning method to attempt different downstream tasks for earth observation related to vegetation, built up surface, croplands and permanent water. We consistently show how we reduce the need for labeled data by 99%, so that with ~200-500 randomly selected labeled examples (around 4K-10K km²) we reach performance levels analogous to those achieved with the full labeled datasets (about 150K image chips or 3M km² in each area of interest - AOI) on all modalities, AOIs and downstream tasks. This leads us to think that the model has captured significant earth features useful in a wide variety of scenarios. To enhance our model's usability in practice, its architecture allows inference in contexts with missing modalities and even missing channels within each modality. Additionally, we visually show that this embedding space, obtained with no labels, is sensitive to the different earth features represented by the labelled datasets we selected. Comment: 9 pages, 6 figures, presented at the NeurIPS workshop on Robustness of Few-shot and Zero-shot Learning in Foundation Models
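The abstract above does not name the classical machine learning method applied to the frozen embeddings. As a minimal sketch only, assuming a nearest-centroid classifier as a stand-in, the few-shot step could look like the following. The function names, the 2-dimensional toy embeddings and the labels are all hypothetical and are not taken from the paper.

```python
from math import dist
from statistics import fmean


def fit_centroids(embeddings, labels):
    """Average the embeddings of each class into one centroid per label."""
    by_label = {}
    for vec, label in zip(embeddings, labels):
        by_label.setdefault(label, []).append(vec)
    # zip(*vecs) iterates over embedding dimensions, so fmean averages per dimension.
    return {
        label: [fmean(column) for column in zip(*vecs)]
        for label, vecs in by_label.items()
    }


def predict(centroids, vec):
    """Return the label whose centroid is closest (Euclidean) to vec."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Hypothetical toy data: four labelled embeddings standing in for a few-shot set.
train_embeddings = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.8, 1.0]]
train_labels = ["water", "water", "vegetation", "vegetation"]

centroids = fit_centroids(train_embeddings, train_labels)
prediction_a = predict(centroids, [0.9, 0.8])
prediction_b = predict(centroids, [0.1, 0.2])
```

Because the pretrained encoder stays frozen in this sketch, only the centroids depend on labels, which is one plausible reason a few hundred labelled examples might suffice.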

    Large Scale Masked Autoencoding for Reducing Label Requirements on SAR Data

    Satellite-based remote sensing is instrumental in the monitoring and mitigation of the effects of anthropogenic climate change. Large scale, high resolution data derived from these sensors can be used to inform intervention and policy decision making, but the timeliness and accuracy of these interventions are limited by the use of optical data, which cannot operate at night and is affected by adverse weather conditions. Synthetic Aperture Radar (SAR) offers a robust alternative to optical data, but its associated complexities limit the scope of labelled data generation for traditional deep learning. In this work, we apply a self-supervised pretraining scheme, masked autoencoding, to SAR amplitude data covering 8.7% of the Earth's land surface area, and tune the pretrained weights on two downstream tasks crucial to monitoring climate change - vegetation cover prediction and land cover classification. We show that the use of this pretraining scheme reduces labelling requirements for the downstream tasks by more than an order of magnitude, and that this pretraining generalises geographically, with the performance gain increasing when tuned downstream on regions outside the pretraining set. Our findings significantly advance climate change mitigation by facilitating the development of task and region-specific SAR models, allowing local communities and organizations to deploy tailored solutions for rapid, accurate monitoring of climate change effects. Comment: 12 pages, 6 figures
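Masked autoencoding, as named in the abstract above, hides a random subset of image patches and trains the model to reconstruct them from the visible ones. The sketch below illustrates only the patch-selection step. The function name, the 14 x 14 patch grid and the 0.75 mask ratio are assumptions chosen for illustration and are not stated in the abstract.

```python
import random


def random_patch_mask(num_patches, mask_ratio, seed=None):
    """Split patch indices into visible and masked sets at the given ratio."""
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must lie in [0, 1]")
    rng = random.Random(seed)
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    visible = [i for i in range(num_patches) if i not in masked]
    return visible, sorted(masked)


# Assumed setup: a 14 x 14 grid of patches with 75% of them hidden.
visible_idx, masked_idx = random_patch_mask(num_patches=14 * 14, mask_ratio=0.75, seed=0)
```

In such a scheme the encoder would see only the patches in `visible_idx`, and the reconstruction loss would be computed on the patches in `masked_idx`.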

    Exploring DINO: Emergent Properties and Limitations for Synthetic Aperture Radar Imagery

    Self-supervised learning (SSL) models have recently demonstrated remarkable performance across various tasks, including image segmentation. This study delves into the emergent characteristics of the Self-Distillation with No Labels (DINO) algorithm and its application to Synthetic Aperture Radar (SAR) imagery. We pre-train a vision transformer (ViT)-based DINO model using unlabeled SAR data, and later fine-tune the model to predict high-resolution land cover maps. We rigorously evaluate the utility of attention maps generated by the ViT backbone and compare them with the model's token embedding space. We observe a small improvement in model performance with pre-training compared to training from scratch and discuss the limitations and opportunities of SSL for remote sensing and land cover segmentation. Beyond small performance increases, we show that ViT attention maps hold great intrinsic value for remote sensing, and could provide useful inputs to other algorithms. With this, our work lays the groundwork for bigger and better SSL models for Earth Observation. Comment: 9 pages, 5 figures

    Probabilistic Super-Resolution of Solar Magnetograms: Generating Many Explanations and Measuring Uncertainties

    Machine learning techniques have been successfully applied to super-resolution tasks on natural images, where visually pleasing results are sufficient. However, in many scientific domains this is not adequate, and estimates of errors and uncertainties are crucial. To address this issue, we propose a Bayesian framework that decomposes uncertainties into epistemic and aleatoric uncertainties. We test the validity of our approach by super-resolving images of the Sun's magnetic field and by generating maps measuring the range of possible high resolution explanations compatible with a given low resolution magnetogram.
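One common way to separate the two kinds of uncertainty named above is the law of total variance: given several stochastic predictions for the same pixel, the mean of the predicted noise variances is read as aleatoric uncertainty and the variance of the predicted means as epistemic uncertainty. The sketch below shows that generic decomposition. It is not confirmed to be the framework used in the paper, and the function name and numeric values are hypothetical.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split the predictive variance of one pixel into aleatoric and epistemic parts.

    means:     predicted mean from each stochastic pass or ensemble member
    variances: predicted noise variance from each pass
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and of equal length")
    aleatoric = fmean(variances)   # average predicted data noise
    epistemic = pvariance(means)   # disagreement between passes
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical predictions for a single super-resolved pixel.
aleatoric, epistemic, total = decompose_uncertainty(
    means=[1.0, 2.0, 3.0],
    variances=[0.5, 0.5, 0.5],
)
```

Applying this per pixel would yield two maps, one for each kind of uncertainty, alongside the super-resolved image.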

    Single-Frame Super-Resolution of Solar Magnetograms: Investigating Physics-Based Metrics & Losses

    Breakthroughs in our understanding of physical phenomena have traditionally followed improvements in instrumentation. Studies of the magnetic field of the Sun, and its influence on the solar dynamo and space weather events, have benefited from improvements in resolution and measurement frequency of new instruments. However, in order to fully understand the solar cycle, high-quality data across time-scales longer than the typical lifespan of a solar instrument are required. At the moment, discrepancies between measurement surveys prevent the combined use of all available data. In this work, we show that machine learning can help bridge the gap between measurement surveys by learning to super-resolve low-resolution magnetic field images and translate between characteristics of contemporary instruments in orbit. We also introduce the notion of physics-based metrics and losses for super-resolution to preserve underlying physics and constrain the solution space of possible super-resolution outputs.

    A Recurrent Mutation in KCNA2 as a Novel Cause of Hereditary Spastic Paraplegia and Ataxia

    The hereditary spastic paraplegias (HSPs) are heterogeneous neurodegenerative disorders with over 50 known causative genes. We identified a recurrent mutation in KCNA2 (c.881G>A, p.R294H), encoding the voltage-gated K+ channel Kv1.2, in two unrelated families with HSP, intellectual disability (ID), and ataxia. Follow-up analysis of >2,000 patients with various neurological phenotypes identified a de novo p.R294H mutation in a proband with ataxia and ID. Two-electrode voltage-clamp recordings of Xenopus laevis oocytes expressing mutant Kv1.2 channels showed loss of function with a dominant-negative effect. Our findings highlight the phenotypic spectrum of a recurrent KCNA2 mutation, implicating ion channel dysfunction as a novel HSP disease mechanism. Peer reviewed

    SLCO5A1 and synaptic assembly genes contribute to impulsivity in juvenile myoclonic epilepsy

    Elevated impulsivity is a key component of attention-deficit hyperactivity disorder (ADHD), bipolar disorder and juvenile myoclonic epilepsy (JME). We performed a genome-wide association, colocalization, polygenic risk score, and pathway analysis of impulsivity in JME (n = 381). Results were followed up with functional characterisation using a Drosophila model. We identified genome-wide associated SNPs at 8q13.3 (P = 7.5 × 10⁻⁹) and 10p11.21 (P = 3.6 × 10⁻⁸). The 8q13.3 locus colocalizes with SLCO5A1 expression quantitative trait loci in cerebral cortex (P = 9.5 × 10⁻³). SLCO5A1 codes for an organic anion transporter and upregulates synapse assembly/organisation genes. Pathway analysis demonstrates 12.7-fold enrichment for presynaptic membrane assembly genes (P = 0.0005) and 14.3-fold enrichment for presynaptic organisation genes (P = 0.0005) including NLGN1 and PTPRD. RNAi knockdown of Oatp30B, the Drosophila polypeptide with the highest homology to SLCO5A1, causes over-reactive startling behaviour (P = 8.7 × 10⁻³) and increased seizure-like events (P = 6.8 × 10⁻⁷). Polygenic risk score for ADHD genetically correlates with impulsivity scores in JME (P = 1.60 × 10⁻³). SLCO5A1 loss-of-function represents an impulsivity and seizure mechanism. Synaptic assembly genes may inform the aetiology of impulsivity in health and disease.