Gradient Estimation for Unseen Domain Risk Minimization with Pre-Trained Models
Domain generalization aims to build generalized models that perform well on
unseen domains when only source domains are available for model optimization.
Recent studies have shown that large-scale pre-trained models can enhance
domain generalization by leveraging their generalization power. However, these
pre-trained models still lack target task-specific knowledge due to discrepancies
between the pre-training objectives and the target task. Although the
task-specific knowledge could be learned from source domains by fine-tuning,
this hurts the generalization power of pre-trained models due to gradient bias
toward the source domains. To alleviate this problem, we propose a new domain
generalization method that estimates unobservable gradients that reduce
potential risks in unseen domains using a large-scale pre-trained model. These
estimated unobservable gradients allow the pre-trained model to learn
task-specific knowledge further while preserving its generalization ability by
relieving the gradient bias. Our experimental results show that our method
outperforms baseline methods on DomainBed, a standard benchmark in domain
generalization. We also provide extensive analyses to demonstrate that the
pre-trained model can learn task-specific knowledge without sacrificing its
generalization power.
Comment: ICCV2023 Workshop Versio
Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection
Recent studies on learning with noisy labels have shown remarkable
performance by exploiting a small clean dataset. In particular, model agnostic
meta-learning-based label correction methods further improve performance by
correcting noisy labels on the fly. However, there is no safeguard against label
miscorrection, resulting in unavoidable performance degradation. Moreover,
every training step requires at least three back-propagations, significantly
slowing down the training speed. To mitigate these issues, we propose a robust
and efficient method that learns a label transition matrix on the fly.
Employing the transition matrix makes the classifier skeptical about all the
corrected samples, which alleviates the miscorrection issue. We also introduce
a two-head architecture to efficiently estimate the label transition matrix
every iteration within a single back-propagation, so that the estimated matrix
closely follows the shifting noise distribution induced by label correction.
Extensive experiments demonstrate that our approach shows the best performance
in training efficiency while having comparable or better accuracy than existing
methods.
Comment: ECCV202
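To make the role of a label transition matrix concrete, here is a minimal sketch of the generic forward loss correction that such a matrix enables. It is not the paper's two-head architecture or its on-the-fly estimation procedure; the function names, the toy matrix `noisy`, and the example logits are all assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def noisy_posterior(clean_probs, transition):
    """Map clean-class probabilities to observed-label probabilities.

    transition[i][j] is assumed to be P(observed label j | true label i).
    """
    num_observed = len(transition[0])
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(len(clean_probs)))
        for j in range(num_observed)
    ]

def forward_corrected_nll(logits, observed_label, transition):
    """Negative log-likelihood of the observed label under the noise model."""
    probs = noisy_posterior(softmax(logits), transition)
    return -math.log(probs[observed_label])

# Hypothetical two-class example: the classifier favors class 0,
# but the (possibly wrong) observed label is class 1.
identity = [[1.0, 0.0], [0.0, 1.0]]   # assumes labels are never flipped
noisy = [[0.8, 0.2], [0.3, 0.7]]      # assumed, illustrative noise rates
logits = [2.0, 0.0]

loss_plain = forward_corrected_nll(logits, 1, identity)
loss_corrected = forward_corrected_nll(logits, 1, noisy)
print(loss_plain, loss_corrected)
```

With the noise-aware matrix, the loss for a label that disagrees with the classifier is smaller than with the identity matrix, which is one way to read the abstract's remark that the classifier becomes skeptical about the given labels.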
TiDAL: Learning Training Dynamics for Active Learning
Active learning (AL) aims to select the most useful data samples from an
unlabeled data pool and annotate them to expand the labeled dataset under a
limited budget. In particular, uncertainty-based methods choose the most uncertain
samples, which are known to be effective in improving model performance.
However, AL literature often overlooks training dynamics (TD), defined as the
ever-changing model behavior during optimization via stochastic gradient
descent, even though other areas of literature have empirically shown that TD
provides important clues for measuring the sample uncertainty. In this paper,
we propose a novel AL method, Training Dynamics for Active Learning (TiDAL),
which leverages the TD to quantify uncertainties of unlabeled data. Since
tracking the TD of all the large-scale unlabeled data is impractical, TiDAL
utilizes an additional prediction module that learns the TD of labeled data. To
further justify the design of TiDAL, we provide theoretical and empirical
evidence to argue the usefulness of leveraging TD for AL. Experimental results
show that TiDAL achieves better or comparable performance on both balanced
and imbalanced benchmark datasets compared to state-of-the-art AL methods,
which estimate data uncertainty using only static information after model
training.
Comment: ICCV 2023 Camera-Read
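The contrast between training dynamics and a static snapshot can be sketched in a few lines. The example below assumes that TD is summarized as the per-class average of the probabilities recorded at each epoch and that uncertainty is measured by entropy; it does not include the prediction module that TiDAL trains for unlabeled data, and the two toy histories are invented.

```python
import math

def training_dynamics(prob_history):
    """Average the class probabilities recorded at each epoch."""
    epochs = len(prob_history)
    classes = len(prob_history[0])
    return [sum(p[c] for p in prob_history) / epochs for c in range(classes)]

def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical per-epoch predictions for two samples (two classes, three epochs).
stable = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
flipping = [[0.9, 0.1], [0.1, 0.9], [0.9, 0.1]]

# Static view: only the final-epoch prediction is inspected.
static_stable = entropy(stable[-1])
static_flipping = entropy(flipping[-1])

# Dynamics view: the whole trajectory is summarized before scoring.
td_stable = entropy(training_dynamics(stable))
td_flipping = entropy(training_dynamics(flipping))
print(static_stable, static_flipping, td_stable, td_flipping)
```

Both samples end training with the same prediction, so the static scores are identical, while the averaged trajectory marks the sample whose prediction flipped as more uncertain.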
Reliable Decision from Multiple Subtasks through Threshold Optimization: Content Moderation in the Wild
Social media platforms struggle to protect users from harmful content through
content moderation. These platforms have recently leveraged machine learning
models to cope with the vast amount of user-generated content posted daily. Since
moderation policies vary depending on countries and types of products, it is
common to train and deploy the models per policy. However, this approach is
highly inefficient, especially when the policies change, requiring dataset
re-labeling and model re-training on the shifted data distribution. To
alleviate this cost inefficiency, social media platforms often employ
third-party content moderation services that provide prediction scores of
multiple subtasks, such as predicting the existence of underage personnel, rude
gestures, or weapons, instead of directly providing final moderation decisions.
However, making a reliable automated moderation decision from the prediction
scores of the multiple subtasks for a specific target policy has not been
widely explored yet. In this study, we formulate real-world scenarios of
content moderation and introduce a simple yet effective threshold optimization
method that searches for the optimal thresholds of the multiple subtasks to make a
reliable moderation decision in a cost-effective way. Extensive experiments
demonstrate that our approach shows better performance in content moderation
compared to existing threshold optimization methods and heuristics.
Comment: WSDM2023 (Oral Presentation
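As a point of reference for the problem setting, the sketch below shows a brute-force grid search over per-subtask thresholds. It is a simple baseline of the kind such a method could be compared against, not the optimization method the paper proposes. The "flag if any subtask exceeds its threshold" rule, the F1 objective, and the toy scores are assumptions.

```python
from itertools import product

def decide(scores, thresholds):
    """Flag an item if any subtask score reaches its threshold."""
    return any(s >= t for s, t in zip(scores, thresholds))

def f1(preds, labels):
    """F1 score for boolean predictions against boolean labels."""
    tp = sum(p and y for p, y in zip(preds, labels))
    fp = sum(p and not y for p, y in zip(preds, labels))
    fn = sum((not p) and y for p, y in zip(preds, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def grid_search(score_rows, labels, grid):
    """Try every threshold combination and keep the first one with the best F1."""
    best_thresholds, best_f1 = None, -1.0
    for thresholds in product(grid, repeat=len(score_rows[0])):
        preds = [decide(row, thresholds) for row in score_rows]
        score = f1(preds, labels)
        if score > best_f1:
            best_thresholds, best_f1 = thresholds, score
    return best_thresholds, best_f1

# Hypothetical scores from two subtasks and policy-level labels for five items.
score_rows = [(0.9, 0.1), (0.2, 0.8), (0.4, 0.3), (0.6, 0.2), (0.1, 0.1)]
labels = [True, True, False, False, False]
grid = [0.25, 0.5, 0.75]

best_thresholds, best_f1 = grid_search(score_rows, labels, grid)
print(best_thresholds, best_f1)
```

The search cost grows exponentially with the number of subtasks, which is one reason exhaustive search is impractical beyond small cases.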
Crowdsourced mapping of unexplored target space of kinase inhibitors
Despite decades of intensive search for compounds that modulate the activity of particular protein targets, a large proportion of the human kinome remains as yet undrugged. Effective approaches are therefore required to map the massive space of unexplored compound-kinase interactions for novel and potent activities. Here, we carry out a crowdsourced benchmarking of predictive algorithms for kinase inhibitor potencies across multiple kinase families tested on unpublished bioactivity data. We find the top-performing predictions are based on various models, including kernel learning, gradient boosting and deep learning, and their ensemble leads to a predictive accuracy exceeding that of single-dose kinase activity assays. We design experiments based on the model predictions and identify unexpected activities even for under-studied kinases, thereby accelerating experimental mapping efforts. The open-source prediction algorithms together with the bioactivities between 95 compounds and 295 kinases provide a resource for benchmarking prediction algorithms and for extending the druggable kinome.
The IDG-DREAM Challenge carried out crowdsourced benchmarking of predictive algorithms for kinase inhibitor activities on unpublished data. This study provides a resource to compare emerging algorithms and prioritize new kinase activities to accelerate drug discovery and repurposing efforts.