Active Discovery of Network Roles for Predicting the Classes of Network Nodes
Nodes in real world networks often have class labels, or underlying
attributes, that are related to the way in which they connect to other nodes.
Sometimes this relationship is simple, for instance nodes of the same class
may be more likely to be connected. In other cases, however, this is not true,
and the way that nodes link in a network exhibits a different, more complex
relationship to their attributes. Here, we consider networks in which we know
how the nodes are connected, but we do not know the class labels of the nodes
or how class labels relate to the network links. We wish to identify the best
subset of nodes to label in order to learn this relationship between node
attributes and network links. We can then use this discovered relationship to
accurately predict the class labels of the rest of the network nodes.
We present a model that identifies groups of nodes with similar link
patterns, which we call network roles, using a generative blockmodel. The model
then predicts labels by learning the mapping from network roles to class labels
using a maximum margin classifier. We choose a subset of nodes to label
according to an iterative margin-based active learning strategy. By integrating
the discovery of network roles with the classifier optimisation, the active
learning process can adapt the network roles to better represent the network
for node classification. We demonstrate the model by exploring a selection of
real world networks, including a marine food web and a network of English
words. We show that, in contrast to other network classifiers, this model
achieves good classification accuracy for a range of networks with different
relationships between class labels and network links.
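The margin-based query step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each node already has one classifier score per class and defines the margin as the gap between the two highest scores. The function name `smallest_margin_query`, the score values, and this particular margin definition are illustrative assumptions, not the authors' implementation, and the blockmodel that discovers network roles is not modelled here.

```python
def smallest_margin_query(scores, labeled):
    """Return the unlabeled node whose two best class scores are closest.

    scores  -- dict mapping node -> list of per-class classifier scores
               (hypothetical values, at least two classes per node)
    labeled -- set of nodes whose labels are already known
    """
    best_node, best_margin = None, float("inf")
    for node, class_scores in scores.items():
        if node in labeled:
            continue  # only unlabeled nodes are candidates for a query
        top, second = sorted(class_scores, reverse=True)[:2]
        margin = top - second  # small margin = classifier is undecided
        if margin < best_margin:
            best_node, best_margin = node, margin
    return best_node, best_margin


# Toy scores for four nodes and three classes (made up for illustration).
scores = {
    "a": [2.0, -1.0, 0.5],
    "b": [0.3, 0.2, -0.4],
    "c": [1.1, 1.0, 0.0],
    "d": [0.9, -0.9, 0.1],
}
labeled = {"c"}  # node "c" has the smallest margin overall but is already labeled

node, margin = smallest_margin_query(scores, labeled)
print(node, round(margin, 3))
```

In an iterative loop, the selected node would be labeled, the classifier refit, and the selection repeated until the labeling budget is spent.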
Active-Learning-as-a-Service: An Efficient MLOps System for Data-Centric AI
The success of today's AI applications requires not only model training
(Model-centric) but also data engineering (Data-centric). In data-centric AI,
active learning (AL) plays a vital role, but current AL tools cannot perform
AL tasks efficiently. To this end, this paper presents an efficient MLOps
system for AL, named ALaaS (Active-Learning-as-a-Service). Specifically, ALaaS
adopts a server-client architecture to support an AL pipeline and implements
stage-level parallelism for high efficiency. Meanwhile, caching and batching
techniques are employed to further accelerate the AL process. In addition to
efficiency, ALaaS ensures accessibility with the help of the design philosophy
of configuration-as-a-service. It also abstracts an AL process into several
components and provides rich APIs for advanced users to extend the system to
new scenarios. Extensive experiments show that ALaaS outperforms all other
baselines in terms of latency and throughput. Further ablation studies
demonstrate the effectiveness of our design as well as ALaaS's ease of use. Our
code is available at https://github.com/MLSysOps/alaas.
Comment: 8 pages, 7 figure
More data means less inference: A pseudo-max approach to structured learning
The problem of learning to predict structured labels is of key importance in many applications. However, for general graph structures both learning and inference in this setting are intractable. Here we show that it is possible to circumvent this difficulty when the input distribution is rich enough, via a method similar in spirit to pseudo-likelihood. We show how our new method achieves consistency, and illustrate empirically that it indeed performs as well as exact methods when sufficiently large training sets are used.
United States-Israel Binational Science Foundation (Grant 2008303); Google (Firm) (Research Grant); Google (Firm) (PhD Fellowship)
Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling Cost
Active learning (AL) combines data labeling and model training to minimize
the labeling cost by prioritizing the selection of high value data that can
best improve model performance. In pool-based active learning, accessible
unlabeled data are not used for model training in most conventional methods.
Here, we propose to unify unlabeled sample selection and model training towards
minimizing labeling cost, and make two contributions towards that end. First,
we exploit both labeled and unlabeled data using semi-supervised learning (SSL)
to distill information from unlabeled data during the training stage. Second,
we propose a consistency-based sample selection metric that is coherent with
the training objective such that the selected samples are effective at
improving model performance. We conduct extensive experiments on image
classification tasks. The experimental results on CIFAR-10, CIFAR-100 and
ImageNet demonstrate the superior performance of our proposed method with
limited labeled data, compared to the existing methods and the alternative AL
and SSL combinations. Additionally, we study an important yet under-explored
problem -- "When can we start learning-based AL selection?". We propose a
measure that is empirically correlated with the AL target loss and is
potentially useful for determining the proper starting point of learning-based
AL methods.
Comment: Accepted by ECCV202
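As a rough illustration of a consistency-based selection metric like the one the abstract above proposes, the sketch below scores each unlabeled sample by how much its predicted class distribution varies across several augmented views and picks the least consistent samples. The abstract does not give the exact metric, so the per-class variance sum, the function names `inconsistency` and `select_inconsistent`, and the toy predictions are all assumptions made for illustration.

```python
from statistics import pvariance


def inconsistency(pred_dists):
    """Sum over classes of the variance of the predicted probability
    across augmented views of one sample (assumed metric)."""
    n_classes = len(pred_dists[0])
    return sum(
        pvariance([dist[c] for dist in pred_dists]) for c in range(n_classes)
    )


def select_inconsistent(pool_preds, k):
    """Return the k sample ids whose predictions are least consistent."""
    ranked = sorted(
        pool_preds, key=lambda s: inconsistency(pool_preds[s]), reverse=True
    )
    return ranked[:k]


# Hypothetical predictions: three augmented views per sample, two classes.
pool_preds = {
    "img_0": [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]],  # fully consistent
    "img_1": [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]],  # prediction flips
    "img_2": [[0.5, 0.5], [0.6, 0.4], [0.5, 0.5]],  # mildly unstable
}

chosen = select_inconsistent(pool_preds, k=2)
print(chosen)
```

The appeal of this kind of metric is that it reuses the quantity a consistency-regularised semi-supervised objective already tries to reduce, so selection and training pull in the same direction.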
Active Learning to Classify Macromolecular Structures in situ for Less Supervision in Cryo-Electron Tomography
Motivation: Cryo-Electron Tomography (cryo-ET) is a 3D bioimaging tool that
visualizes the structural and spatial organization of macromolecules at a
near-native state in single cells, which has broad applications in life
science. However, the systematic structural recognition and recovery of
macromolecules captured by cryo-ET are difficult due to high structural
complexity and imaging limits. Deep learning-based subtomogram classification
has played a critical role in such tasks. As supervised approaches, however,
their performance relies on sufficient and laborious annotation on a large
training dataset.
Results: To alleviate this major labeling burden, we proposed a Hybrid Active
Learning (HAL) framework for querying subtomograms for labelling from a large
unlabeled subtomogram pool. Firstly, HAL adopts uncertainty sampling to select
the subtomograms that have the most uncertain predictions. Moreover, to
mitigate the sampling bias caused by such a strategy, a discriminator is
introduced to judge whether a given subtomogram is labeled or unlabeled, and
the model subsequently queries the subtomograms that have higher probabilities
to be unlabeled. Additionally, HAL introduces a subset sampling strategy to
improve the diversity of the query set, so that the information overlap is
decreased between the queried batches and the algorithmic efficiency is
improved. Our experiments on subtomogram classification tasks using both
simulated and real data demonstrate that we can achieve comparable testing
performance (on average only 3% accuracy drop) by using less than 30% of the
labeled subtomograms, which is a very promising result for subtomogram
classification tasks with limited labeling resources.
Comment: Statement on authorship changes: Dr. Eric Xing was an academic
advisor of Mr. Haohan Wang. Dr. Xing was not directly involved in this work
and has no direct interaction or collaboration with any other authors on this
work. Therefore, Dr. Xing is removed from the author list according to his
request. Mr. Zhenxi Zhu's affiliation is updated to his current affiliation.
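The uncertainty sampling step that the HAL abstract above describes can be sketched as follows. The abstract does not say which uncertainty measure is used, so predictive entropy is assumed here. The names `entropy` and `most_uncertain` and the probability values are hypothetical, and the discriminator and subset sampling components are not modelled.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def most_uncertain(pool_probs, k):
    """Return the k sample ids with the highest predictive entropy."""
    ranked = sorted(
        pool_probs, key=lambda i: entropy(pool_probs[i]), reverse=True
    )
    return ranked[:k]


# Hypothetical softmax outputs for four unlabeled subtomograms, three classes.
pool_probs = {
    "s0": [0.98, 0.01, 0.01],  # confident prediction
    "s1": [0.34, 0.33, 0.33],  # close to uniform, most uncertain
    "s2": [0.70, 0.20, 0.10],
    "s3": [0.50, 0.49, 0.01],
}

query = most_uncertain(pool_probs, k=2)
print(query)
```

Ranking by entropy alone tends to pick clusters of similar samples, which is the sampling bias the abstract's discriminator and subset sampling are meant to counter.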
Stochastic Adversarial Gradient Embedding for Active Domain Adaptation
Unsupervised Domain Adaptation (UDA) aims to bridge the gap between a source
domain, where labelled data are available, and a target domain only represented
with unlabelled data. Although domain-invariant representations have dramatically
improved the adaptability of models, guaranteeing their good transferability
remains a challenging problem. This paper addresses this problem by using
active learning to annotate a small budget of target data. Although this setup,
called Active Domain Adaptation (ADA), deviates from UDA's standard setup, a
wide range of practical applications face this situation. To this
end, we introduce Stochastic Adversarial Gradient Embedding
(SAGE), a framework that makes a triple contribution to ADA. First, we select
for annotation target samples that are likely to improve the representations'
transferability by measuring the variation, before and after annotation, of the
transferability loss gradient. Second, we increase sampling diversity by
promoting different gradient directions. Third, we introduce a novel training
procedure for actively incorporating target samples when learning invariant
representations. SAGE is based on solid theoretical ground and validated on
various UDA benchmarks against several baselines. Our empirical investigation
demonstrates that SAGE takes the best of uncertainty vs. diversity
sampling and substantially improves the transferability of representations.
Fine-Tuning Language Models via Epistemic Neural Networks
Large language models are now part of a powerful new paradigm in machine
learning. These models learn a wide range of capabilities from training on
large unsupervised text corpora. In many applications, these capabilities are
then fine-tuned through additional training on specialized data to improve
performance in that setting. In this paper, we augment these models with an
epinet: a small additional network architecture that helps to estimate model
uncertainty and form an epistemic neural network (ENN). ENNs are neural
networks that can know what they don't know. We show that, using an epinet to
prioritize uncertain data, we can fine-tune BERT on GLUE tasks to the same
performance while using 2x less data. We also investigate performance in
synthetic neural network generative models designed to build understanding. In
each setting, using an epinet outperforms heuristic active learning schemes.