19 research outputs found
Multi-Domain Adversarial Learning
Multi-domain learning (MDL) aims at obtaining a model with minimal average risk across multiple domains. Our empirical motivation is automated microscopy data, where cultured cells are imaged after being exposed to known and unknown chemical perturbations, and each dataset displays significant experimental bias. This paper presents a multi-domain adversarial learning approach, MULANN, to leverage multiple datasets with overlapping but distinct class sets, in a semi-supervised setting. Our contributions include: i) a bound on the average- and worst-domain risk in MDL, obtained using the H-divergence; ii) a new loss to accommodate semi-supervised multi-domain learning and domain adaptation; iii) the experimental validation of the approach, improving on the state of the art on two standard image benchmarks, and a novel bioimage dataset, CELL
Introducing v0.5 of the AI Safety Benchmark from MLCommons
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-purpose assistant in English), and a limited set of personas (i.e., typical users, malicious users, and vulnerable users). We created a new taxonomy of 13 hazard categories, of which 7 have tests in the v0.5 benchmark. We plan to release version 1.0 of the AI Safety Benchmark by the end of 2024. The v1.0 benchmark will provide meaningful insights into the safety of AI systems. However, the v0.5 benchmark should not be used to assess the safety of AI systems. We have sought to fully document the limitations, flaws, and challenges of v0.5. This release of v0.5 of the AI Safety Benchmark includes (1) a principled approach to specifying and constructing the benchmark, which comprises use cases, types of systems under test (SUTs), language and context, personas, tests, and test items; (2) a taxonomy of 13 hazard categories with definitions and subcategories; (3) tests for seven of the hazard categories, each comprising a unique set of test items, i.e., prompts. There are 43,090 test items in total, which we created with templates; (4) a grading system for AI systems against the benchmark; (5) an openly available platform, and downloadable tool, called ModelBench that can be used to evaluate the safety of AI systems on the benchmark; (6) an example evaluation report which benchmarks the performance of over a dozen openly available chat-tuned language models; (7) a test specification for the benchmark
The versatility of high-content high-throughput time-lapse screening data : developing generic methods for data re-use and comparative analyses
Biological screens test large sets of experimental conditions for their specific effect on living systems. Technical and computational progress has made it possible to perform such screens at large scale, up to hundreds of thousands of experiments. Live cell imaging is an excellent tool for studying in detail the consequences of a chemical perturbation on a given biological process. However, the analysis of live cell screens demands a combination of robust computer vision methods, efficient statistical methods for detecting significant effects, and robust quality-control procedures. This thesis addresses these challenges by developing analytical methods for high-throughput (HT) time-lapse microscopy screening data. The frameworks developed are applied to publicly available high-content screening (HCS) data, demonstrating their applicability and the benefits of re-mining HCS data. Chapter 2 details the first multivariate workflow for studying single-cell motility in such large-scale data. Chapter 3 applies this workflow to previously published data and develops a new distance for drug target inference through in silico comparison of parallel siRNA and drug screens. Finally, Chapter 4 presents a complete methodological pipeline for performing HT time-lapse screens in environmental toxicology.
Stochastic Gradient Descent: Going As Fast As Possible But Not Faster
Short version of https://arxiv.org/abs/1709.01427. When applied to training deep neural networks, stochastic gradient descent (SGD) often incurs steady progression phases, interrupted by catastrophic episodes in which loss and gradient norm explode. A possible mitigation of such events is to slow down the learning process. This paper presents a novel approach, called SALERA, to control the SGD learning rate using two statistical tests. The first one, aimed at fast learning, compares the momentum of the normalized gradient vectors to that of random unit vectors and accordingly increases or decreases the learning rate gracefully. The second one is a change point detection test, aimed at detecting catastrophic learning episodes; when it triggers, the learning rate is instantly halved. Experiments on standard benchmarks show that SALERA performs well in practice and compares favorably to the state of the art.
Machine Learning (ML) algorithms require efficient optimization techniques, whether to solve convex problems (e.g., for SVMs) or non-convex ones (e.g., for Neural Networks). As the data size and the model dimensionality increase, mainstream convex optimization methods are adversely affected. Overall, Stochastic Gradient Descent (SGD) is increasingly adopted. Within the SGD framework, one of the main issues is how to control the learning rate. The adequate speed depends both on the current state of the system (the weight vector) and on the current mini-batch. Often, the eventual convergence of SGD is ensured by decaying the learning rate as O(1/t) [23, 6] or O(1/√t) [29]. While learning rate decay effectively prevents catastrophic events (a sudden surge of the training loss and gradient norm), many diverse approaches have been designed to achieve quicker convergence through learning rate adaptation [1, 7, 24, 16, 26, 2] (more in Section 1).
This paper proposes a novel approach to adaptive SGD, called SALERA (Safe Agnostic LEarning Rate Adaptation), based on the conjecture that, if catastrophes are well taken care of, the learning process can speed up whenever successive gradient directions show more correlation than random. The frequent occurrence of catastrophic episodes [11, Chapter 8], [3] raises the question of how best to mitigate their impact. Framing catastrophic episodes as random events, we adopt a purely curative strategy: detecting and instantly curing catastrophic episodes. Formally, a sequential cumulative sum change detection test, the Page-Hinkley (PH) test [20, 14], is adapted and used to monitor the learning curve reporting the mini-batch losses. If a change in the learning curve is detected, the system undergoes an instant cure by halving the learning rate and backtracking to its former state.
Once the risk of catastrophic episodes is well addressed, the learning rate can be adapted in a more agile manner: the ALERA (Agnostic LEarning Rate Adaptation) process increases (resp. decreases) the learning rate whenever the correlation among successive gradient directions is higher (resp. lower) than random, by comparing the actual gradient momentum and the agnostic momentum built from random unit vectors. The contribution of the paper is twofold. First, it proposes an original and efficient way to control learning dynamics (Section 2.1). Secondly, it opens a new approach for handling catastrophic events and salvaging a
OPTML 2017: 10th NIPS Workshop on Optimization for Machine Learning (NIPS 2017)
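The change-detection step described in this abstract can be illustrated with a minimal sketch. It uses the textbook one-sided Page-Hinkley test on a stream of mini-batch losses and halves a learning rate on each alarm. This is not the authors' adapted variant: the class name `PageHinkley`, the helper `monitor`, and the values of `delta` and `threshold` are hypothetical choices made for illustration, and the backtracking to the former model state mentioned in the abstract is omitted.

```python
from dataclasses import dataclass


@dataclass
class PageHinkley:
    """Textbook one-sided Page-Hinkley test for an upward shift in the mean."""
    delta: float = 0.0      # tolerated drift (hypothetical default)
    threshold: float = 5.0  # alarm threshold (hypothetical default)
    n: int = 0              # number of observations seen so far
    mean: float = 0.0       # running mean of the observations
    cum: float = 0.0        # cumulative deviation from the running mean
    cum_min: float = 0.0    # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Feed one observation; return True if a change is detected."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.cum - self.cum_min > self.threshold


def monitor(losses, lr=0.1, threshold=5.0):
    """Return (step, new_lr) for every alarm raised on the loss stream.

    On each alarm the learning rate is halved and the test is restarted.
    Restoring earlier model weights is left out of this sketch.
    """
    test = PageHinkley(threshold=threshold)
    events = []
    for step, loss in enumerate(losses):
        if test.update(loss):
            lr *= 0.5
            events.append((step, lr))
            test = PageHinkley(threshold=threshold)
    return events


if __name__ == "__main__":
    # A flat loss curve followed by a sudden jump triggers one alarm.
    print(monitor([1.0] * 20 + [10.0] * 5))
```

In a real training loop the alarm would also restore a saved copy of the weights, and the threshold would need tuning to the scale of the loss.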