9 research outputs found

    Quantum Kernel Mixtures for Probabilistic Deep Learning

    Full text link
    This paper presents a novel approach to probabilistic deep learning (PDL), quantum kernel mixtures, derived from the mathematical formalism of quantum density matrices, which provides a simpler yet effective mechanism for representing joint probability distributions of both continuous and discrete random variables. The framework allows for the construction of differentiable models for density estimation, inference, and sampling, enabling integration into end-to-end deep neural models. In doing so, we provide a versatile representation of marginal and joint probability distributions that allows us to develop a differentiable, compositional, and reversible inference procedure that covers a wide range of machine learning tasks, including density estimation, discriminative learning, and generative modeling. We illustrate the broad applicability of the framework with two examples: an image classification model, which can be naturally transformed into a conditional generative model thanks to the reversibility of our inference procedure; and a model for learning with label proportions, which is a weakly supervised classification task, demonstrating the framework's ability to deal with uncertainty in the training samples

    Exploring Generalisability of Self-Distillation with No Labels for SAR-Based Vegetation Prediction

    Full text link
    In this work we pre-train a DINO-ViT based model using two Synthetic Aperture Radar datasets (S1GRD or GSSIC) across three regions (China, Conus, Europe). We fine-tune the models on smaller labeled datasets to predict vegetation percentage, and empirically study the connection between the embedding space of the models and their ability to generalize across diverse geographic regions and to unseen data. For S1GRD, embedding spaces of different regions are clearly separated, while GSSIC's overlaps. Positional patterns remain during fine-tuning, and greater distances in embeddings often result in higher errors for unfamiliar regions. With this, our work increases our understanding of generalizability for self-supervised models applied to remote sensing.Comment: 10 pages, 9 figure

    Large Scale Masked Autoencoding for Reducing Label Requirements on SAR Data

    Full text link
    Satellite-based remote sensing is instrumental in the monitoring and mitigation of the effects of anthropogenic climate change. Large scale, high resolution data derived from these sensors can be used to inform intervention and policy decision making, but the timeliness and accuracy of these interventions is limited by use of optical data, which cannot operate at night and is affected by adverse weather conditions. Synthetic Aperture Radar (SAR) offers a robust alternative to optical data, but its associated complexities limit the scope of labelled data generation for traditional deep learning. In this work, we apply a self-supervised pretraining scheme, masked autoencoding, to SAR amplitude data covering 8.7\% of the Earth's land surface area, and tune the pretrained weights on two downstream tasks crucial to monitoring climate change - vegetation cover prediction and land cover classification. We show that the use of this pretraining scheme reduces labelling requirements for the downstream tasks by more than an order of magnitude, and that this pretraining generalises geographically, with the performance gain increasing when tuned downstream on regions outside the pretraining set. Our findings significantly advance climate change mitigation by facilitating the development of task and region-specific SAR models, allowing local communities and organizations to deploy tailored solutions for rapid, accurate monitoring of climate change effects.Comment: 12 pages, 6 figure

    Fewshot learning on global multimodal embeddings for earth observation tasks

    Full text link
    In this work we pretrain a CLIP/ViT based model using three different modalities of satellite imagery across five AOIs covering over ~10\% of Earth's total landmass, namely Sentinel 2 RGB optical imagery, Sentinel 1 SAR radar amplitude and interferometric coherence. This model uses 250\sim 250 M parameters. Then, we use the embeddings produced for each modality with a classical machine learning method to attempt different downstream tasks for earth observation related to vegetation, built up surface, croplands and permanent water. We consistently show how we reduce the need for labeled data by 99\%, so that with ~200-500 randomly selected labeled examples (around 4K-10K km2^2) we reach performance levels analogous to those achieved with the full labeled datasets (about 150K image chips or 3M km2^2 in each area of interest - AOI) on all modalities, AOIs and downstream tasks. This leads us to think that the model has captured significant earth features useful in a wide variety of scenarios. To enhance our model's usability in practice, its architecture allows inference in contexts with missing modalities and even missing channels within each modality. Additionally, we visually show that this embedding space, obtained with no labels, is sensible to the different earth features represented by the labelled datasets we selected.Comment: 9 pages, 6 figures, presented on NeurIPS workshop on Robustness of Few-shot and Zero-shot Learning in Foundation Model

    Exploring DINO: Emergent Properties and Limitations for Synthetic Aperture Radar Imagery

    Full text link
    Self-supervised learning (SSL) models have recently demonstrated remarkable performance across various tasks, including image segmentation. This study delves into the emergent characteristics of the Self-Distillation with No Labels (DINO) algorithm and its application to Synthetic Aperture Radar (SAR) imagery. We pre-train a vision transformer (ViT)-based DINO model using unlabeled SAR data, and later fine-tune the model to predict high-resolution land cover maps. We rigorously evaluate the utility of attention maps generated by the ViT backbone and compare them with the model's token embedding space. We observe a small improvement in model performance with pre-training compared to training from scratch and discuss the limitations and opportunities of SSL for remote sensing and land cover segmentation. Beyond small performance increases, we show that ViT attention maps hold great intrinsic value for remote sensing, and could provide useful inputs to other algorithms. With this, our work lays the groundwork for bigger and better SSL models for Earth Observation.Comment: 9 pages, 5 figure

    Efficient Non-parametric Neural Density Estimation and Its Application to Outlier and Anomaly Detection

    No full text
    The main goal of this thesis is to develop efficient non-parametric density estimation methods that can be integrated with deep learning architectures, for instance, convolutional neural networks and transformers. Density estimation methods can be applied to different problems in statistics and machine learning. They may be used to solve tasks such as anomaly detection, generative models, semi-supervised learning, compression, text-to-speech, among others. The present work will mainly focus on the application of the method in anomaly and outlier detection tasks such as medical anomaly detection, fraud detection, video surveillance, time series anomaly detection, industrial damage detection, among others. A recent approach to non-parametric density estimation is neural density estimation. One advantage of these methods is that they can be integrated with deep learning architectures and trained using gradient descent. Most of these methods are based on neural network implementations of normalizing flows which transform an original simpler distribution to a more complex one. The approach of this thesis is based on a different idea that combines random Fourier features with density matrices to estimate the underlying distribution function. The method can be seen as an approximation of the popular kernel density estimation method but without the inherent computational cost

    Estimación neuronal no paramétrica eficiente de la densidad y su aplicación a la detección de valores atípicos y anomalías

    No full text
    ilustraciones, diagramasThe main goal of this thesis is to propose efficient non-parametric density estimation methods that can be integrated with deep learning architectures, for instance, convolutional neural networks and transformers. A recent approach to non-parametric density estimation is neural density estimation. One advantage of these methods is that they can be integrated with deep learning architectures and trained using gradient descent. Most of these methods are based on neural network implementations of normalizing flows which transform an original simpler distribution to a more complex one. The approach of this thesis is based on a different idea that combines random Fourier features with density matrices to estimate the underlying distribution function. The method can be seen as an approximation of the popular kernel density estimation method but without the inherent computational cost. Density estimation methods can be applied to different problems in statistics and machine learning. They may be used to solve tasks such as anomaly detection, generative models, semi-supervised learning, compression, text-to-speech, among others. This thesis explores the application of the method in anomaly and outlier detection tasks such as medical anomaly detection, fraud detection, video surveillance, time series anomaly detection, industrial damage detection, among others. (Texto tomado de la fuente)El objetivo principal de esta tesis es proponer m´etodos eficientes de estimaci´on de densidad no param´etrica que puedan integrarse con arquitecturas de aprendizaje profundo, por ejemplo, redes neuronales convolucionales y transformadores. Una aproximaci´on reciente a la estimaci´on no param´etrica de la densidad es la estimaci´on de la densidad usando redes neuronales. Una de las ventajas de estos m´etodos es que pueden integrarse con arquitecturas de aprendizaje profundo y entrenarse mediante gradiente descendente. La mayor´ıa de estos m´etodos se basan en implementaciones de redes neuronales de flujos de normalizaci´on que transforman una distribuci´on original m´as simple en una m´as compleja. El enfoque de esta tesis se basa en una nueva idea diferente que combina caracter´ısticas aleatorias de Fourier con matrices de densidad para estimar la funci´on de distribuci´on subyacente. El m´etodo puede considerarse una aproximaci´on al popular m´etodo kernel density estimation, pero sin el coste computacional inherente. Los m´etodos de estimaci´on de la densidad pueden aplicarse a diferentes problemas en estad´ıstica y aprendizaje autom´atico. Pueden ser utilizados para resolver tareas como la detecci´on de anomal´ıas, modelos generativos, aprendizaje semi-supervisado, compresi´on, texto a habla, entre otros. El presente trabajo se centra principalmente en la aplicaci´on del m´etodo en tareas de detecci´on de anomal´ıas y valores at´ıpicos como la detecci´on de anomal´ıas m´edicas, la detecci´on de fraudes, la videovigilancia, la detecci´on de anomal´ıas en series temporales, la detecci´on de da˜nos industriales, entre otras.DoctoradoDoctor en IngenieríaMachine Learnin

    LEAN-DMKDE: Quantum Latent Density Estimation for Anomaly Detection (Student Abstract)

    No full text
    This paper presents an anomaly detection model that combines the strong statistical foundation of density-estimation-based anomaly detection methods with the representation-learning ability of deep-learning models. The method combines an autoencoder, that learns a low-dimensional representation of the data, with a density-estimation model based on density matrices in an end-to-end architecture that can be trained using gradient-based optimization techniques. A systematic experimental evaluation was performed on different benchmark datasets. The experimental results show that the method is able to outperform other state-of-the-art methods

    Geoeconomic variations in epidemiology, ventilation management, and outcomes in invasively ventilated intensive care unit patients without acute respiratory distress syndrome: a pooled analysis of four observational studies

    No full text
    Background: Geoeconomic variations in epidemiology, the practice of ventilation, and outcome in invasively ventilated intensive care unit (ICU) patients without acute respiratory distress syndrome (ARDS) remain unexplored. In this analysis we aim to address these gaps using individual patient data of four large observational studies. Methods: In this pooled analysis we harmonised individual patient data from the ERICC, LUNG SAFE, PRoVENT, and PRoVENT-iMiC prospective observational studies, which were conducted from June, 2011, to December, 2018, in 534 ICUs in 54 countries. We used the 2016 World Bank classification to define two geoeconomic regions: middle-income countries (MICs) and high-income countries (HICs). ARDS was defined according to the Berlin criteria. Descriptive statistics were used to compare patients in MICs versus HICs. The primary outcome was the use of low tidal volume ventilation (LTVV) for the first 3 days of mechanical ventilation. Secondary outcomes were key ventilation parameters (tidal volume size, positive end-expiratory pressure, fraction of inspired oxygen, peak pressure, plateau pressure, driving pressure, and respiratory rate), patient characteristics, the risk for and actual development of acute respiratory distress syndrome after the first day of ventilation, duration of ventilation, ICU length of stay, and ICU mortality. Findings: Of the 7608 patients included in the original studies, this analysis included 3852 patients without ARDS, of whom 2345 were from MICs and 1507 were from HICs. Patients in MICs were younger, shorter and with a slightly lower body-mass index, more often had diabetes and active cancer, but less often chronic obstructive pulmonary disease and heart failure than patients from HICs. Sequential organ failure assessment scores were similar in MICs and HICs. Use of LTVV in MICs and HICs was comparable (42·4% vs 44·2%; absolute difference -1·69 [-9·58 to 6·11] p=0·67; data available in 3174 [82%] of 3852 patients). The median applied positive end expiratory pressure was lower in MICs than in HICs (5 [IQR 5-8] vs 6 [5-8] cm H2O; p=0·0011). ICU mortality was higher in MICs than in HICs (30·5% vs 19·9%; p=0·0004; adjusted effect 16·41% [95% CI 9·52-23·52]; p<0·0001) and was inversely associated with gross domestic product (adjusted odds ratio for a US$10 000 increase per capita 0·80 [95% CI 0·75-0·86]; p<0·0001). Interpretation: Despite similar disease severity and ventilation management, ICU mortality in patients without ARDS is higher in MICs than in HICs, with a strong association with country-level economic status
    corecore