98 research outputs found

    Multi-Source Domain Adaptation meets Dataset Distillation through Dataset Dictionary Learning

    In this paper, we consider the intersection of two problems in machine learning: Multi-Source Domain Adaptation (MSDA) and Dataset Distillation (DD). The first considers adapting multiple heterogeneous labeled source domains to an unlabeled target domain. The second addresses the problem of synthesizing a small summary containing all the information about the datasets. We thus consider a new problem called MSDA-DD. To solve it, we adapt previous works in the MSDA literature, such as Wasserstein Barycenter Transport and Dataset Dictionary Learning, as well as the DD method Distribution Matching. We thoroughly experiment with this novel problem on four benchmarks (Caltech-Office 10, Tennessee-Eastman Process, Continuous Stirred Tank Reactor, and Case Western Reserve University), where we show that, even with as little as 1 sample per class, one achieves state-of-the-art adaptation performance. Comment: 7 pages, 4 figures

    Recent Advances in Optimal Transport for Machine Learning

    Recently, Optimal Transport has been proposed as a probabilistic framework in Machine Learning for comparing and manipulating probability distributions. This is rooted in its rich history and theory, and has offered new solutions to different problems in machine learning, such as generative modeling and transfer learning. In this survey we explore contributions of Optimal Transport for Machine Learning over the period 2012-2022, focusing on four sub-fields of Machine Learning: supervised, unsupervised, transfer and reinforcement learning. We further highlight the recent development in computational Optimal Transport, and its interplay with Machine Learning practice. Comment: 20 pages, 5 figures, under review

    Multi-Source Domain Adaptation through Dataset Dictionary Learning in Wasserstein Space

    This paper seeks to solve Multi-Source Domain Adaptation (MSDA), which aims to mitigate data distribution shifts when transferring knowledge from multiple labeled source domains to an unlabeled target domain. We propose a novel MSDA framework based on dictionary learning and optimal transport. We interpret each domain in MSDA as an empirical distribution. As such, we express each domain as a Wasserstein barycenter of dictionary atoms, which are empirical distributions. We propose a novel algorithm, DaDiL, for learning via mini-batches: (i) atom distributions; (ii) a matrix of barycentric coordinates. Based on our dictionary, we propose two novel methods for MSDA: DaDiL-R, based on the reconstruction of labeled samples in the target domain, and DaDiL-E, based on the ensembling of classifiers learned on atom distributions. We evaluate our methods on 3 benchmarks: Caltech-Office, Office 31, and CWRU, where we improve on the previous state of the art by 3.15%, 2.29%, and 7.71% in classification performance, respectively. Finally, we show that interpolations in the Wasserstein hull of learned atoms provide data that can generalize to the target domain. Comment: 13 pages, 8 figures, Accepted as a conference paper at the 26th European Conference on Artificial Intelligence
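
The idea of expressing a domain as a Wasserstein barycenter of atom distributions can be illustrated in one dimension, where the Wasserstein-2 barycenter of uniform empirical distributions with equal sample counts reduces to a weighted average of sorted samples. The sketch below is only that toy case, not the DaDiL algorithm: the function name `barycenter_1d`, the Gaussian atoms, and the weights are assumptions made for illustration, and nothing here learns atoms or barycentric coordinates.

```python
import numpy as np

def barycenter_1d(atoms, weights):
    """Wasserstein-2 barycenter of 1-D empirical distributions (toy case).

    atoms:   array of shape (k, n), k atoms with n samples each (uniform mass).
    weights: barycentric coordinates, non-negative and summing to 1.
    In 1-D the barycenter is obtained by averaging sorted samples
    (i.e. quantiles) with the barycentric weights.
    """
    atoms = np.sort(np.asarray(atoms, dtype=float), axis=1)
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 1 or weights.shape[0] != atoms.shape[0]:
        raise ValueError("need exactly one weight per atom")
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must lie on the probability simplex")
    return weights @ atoms

# Hypothetical atoms: two Gaussians; the "domain" mixes them 25% / 75%.
rng = np.random.default_rng(0)
atoms = np.stack([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])
domain = barycenter_1d(atoms, [0.25, 0.75])
```

Unlike a mixture of the two atoms, the barycenter is a single displaced distribution whose location lies between those of the atoms. The method described in the abstract works with labeled, multi-dimensional empirical distributions and learns the atoms and coordinates by mini-batch optimization, which this sketch does not attempt.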

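    Devenir de polluants émergents lors d'un traitement photochimique ou photocatalytique sous irradiation solaire [Fate of emerging pollutants during photochemical or photocatalytic treatment under solar irradiation]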

    Industrialisation and the everyday use of a growing number of chemical and pharmaceutical products have led to the release into the environment of various substances known as "emerging pollutants". Existing wastewater treatments are not designed to eliminate this kind of pollution, so these pollutants are released into natural aquatic media. One way to limit the release of these compounds in wastewater treatment plant effluent could be the use of additional treatment processes such as advanced oxidation processes. In this context, the European project Clean Water started in 2009. Clean Water involves 7 entities, including the GEPEA laboratory-Ecole des Mines de Nantes.

    The aim of the Clean Water project is to develop sustainable and cost-effective water treatment and detoxification processes using TiO2 nanomaterials with a UV-visible light response under solar light. These processes are intended to remove emerging contaminants such as endocrine disruptors and pharmaceuticals. In this program, the GEPEA laboratory is concerned with evaluating the efficiency of novel photocatalysts under UV and visible irradiation for the elimination of emerging pollutants. For this purpose, an experimental methodology was established to express the efficiency of the tested catalysts in terms of degradation kinetic constants, pollutant conversion and mineralisation, and also in terms of the intermediate products formed. The efficiency of the photocatalysts is also evaluated in terms of the biodegradability, toxicity and endocrine disruption effects of the intermediates. First, the experimental methodology was tested on the degradation of tetracycline with a reference catalyst. Then, it was applied to the degradation of bisphenol A and of estradiol, with the reference catalyst and the catalysts developed within the Clean Water project. The results obtained on tetracycline degradation showed that: i) the tetracycline intermediate products are less toxic than tetracycline; ii) the structure of the intermediates is similar to that of tetracycline, which can explain the low biodegradability observed for these intermediates. For the degradation of bisphenol A and estradiol, the results showed that: i) the photocatalysts are efficient under simulated solar irradiation, although the photocatalytic efficiency of a catalyst depends on the compound to be degraded; ii) the nature of the identified bisphenol A reaction intermediates depends on the catalyst used; iii) the estrogenic effect of the treated estradiol solution persists during the photocatalytic treatment.

    Multi-Source Domain Adaptation for Cross-Domain Fault Diagnosis of Chemical Processes

    Fault diagnosis is an essential component in process supervision. Indeed, it determines which kind of fault has occurred, given that it has been previously detected, allowing for appropriate intervention. Automatic fault diagnosis systems use machine learning for predicting the fault type from sensor readings. Nonetheless, these models are sensitive to changes in the data distributions, which may be caused by changes in the monitored process, such as changes in the mode of operation. This scenario is known as Cross-Domain Fault Diagnosis (CDFD). We provide an extensive comparison of single and multi-source unsupervised domain adaptation (SSDA and MSDA respectively) algorithms for CDFD. We study these methods in the context of the Tennessee-Eastman Process, a widely used benchmark in the chemical industry. We show that using multiple domains during training has a positive effect, even when no adaptation is employed. As such, the MSDA baseline improves over the SSDA baseline classification accuracy by 23% on average. In addition, under the multiple-sources scenario, we improve classification accuracy of the no adaptation setting by 8.4% on average. Comment: 18 pages, 15 figures

    Galaxy Image Restoration with Shape Constraint

    Images acquired with a telescope are blurred and corrupted by noise. The blurring is usually modeled by a convolution with the Point Spread Function and the noise by Additive Gaussian Noise. Recovering the underlying image from such an observation is an ill-posed inverse problem. Sparse deconvolution is well known to be an efficient deconvolution technique, leading to optimized pixel Mean Square Errors, but without any guarantee that the shapes of objects (e.g. galaxy images) contained in the data will be preserved. In this paper, we introduce a new shape constraint and exhibit its properties. By combining it with a standard sparse regularization in the wavelet domain, we introduce the Shape COnstraint REstoration algorithm (SCORE), which performs a standard sparse deconvolution, while preserving galaxy shapes. We show through numerical experiments that this new approach leads to a reduction of galaxy ellipticity measurement errors by at least 44%. Comment: 22 pages, 6 figures, 1 table, accepted in Journal of Fourier Analysis and Applications
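
The observation model stated in this abstract (convolution with the Point Spread Function plus additive Gaussian noise) can be sketched as follows. This covers only the forward model, not the SCORE algorithm or any deconvolution. The isotropic Gaussian PSF, the circular boundary handling, and the names `gaussian_psf` and `observe` are assumptions made for illustration.

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Hypothetical isotropic Gaussian PSF of odd width `size`, normalized to sum 1."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def observe(image, psf, noise_std, rng):
    """Simulate y = h * x + n: circular convolution via FFT, then Gaussian noise."""
    k = psf.shape[0]
    padded = np.zeros_like(image, dtype=float)
    padded[:k, :k] = psf
    # Shift the PSF so that its center sits at pixel (0, 0); otherwise the
    # blurred image would be translated by half the kernel width.
    padded = np.roll(padded, (-(k // 2), -(k // 2)), axis=(0, 1))
    blurred = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))
    return blurred + rng.normal(0.0, noise_std, image.shape)

# A single bright pixel stands in for a point source.
rng = np.random.default_rng(0)
x = np.zeros((32, 32))
x[16, 16] = 1.0
y = observe(x, gaussian_psf(5, 1.0), noise_std=0.01, rng=rng)
```

Because the blur attenuates high spatial frequencies and the noise is added afterwards, naively dividing by the PSF spectrum amplifies the noise, which is why recovering `x` from `y` needs regularization such as the sparsity and shape constraints mentioned above.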