Prompt-Based Zero- and Few-Shot Node Classification: A Multimodal Approach
Multimodal data empowers machine learning models to better understand the
world from various perspectives. In this work, we study the combination of
text and graph modalities, a challenging but understudied combination
which is prevalent across multiple settings including citation networks, social
media, and the web. We focus on the popular task of node classification using
limited labels; in particular, under the zero- and few-shot scenarios. In
contrast to the standard pipeline, which feeds precomputed (e.g.,
bag-of-words) text features into a graph neural network, we propose
Text-And-Graph (TAG) learning, a more deeply
multimodal approach that integrates the raw texts and graph topology into the
model design, and can effectively learn from limited supervised signals without
any meta-learning procedure. TAG is a two-stage model with (1) a prompt- and
graph-based module which generates prior logits that can be directly used for
zero-shot node classification, and (2) a trainable module that further
calibrates these prior logits in a few-shot manner. Experiments on two node
classification datasets show that TAG outperforms all the baselines by a large
margin in both zero- and few-shot settings.
Comment: Work in progress
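The two-stage idea described in this abstract (prior logits usable for zero-shot prediction, then few-shot calibration) can be illustrated with a minimal sketch. This is not the authors' TAG model: the prior logits are assumed to be given, the graph step is a plain neighbor average, and the calibration is a hypothetical per-class additive bias fitted by gradient descent. All names and numbers below are illustrative assumptions.

```python
import math

def smooth_logits(logits, neighbors, alpha=0.5):
    """Mix each node's logits with the mean of its neighbors' logits.

    A stand-in for a graph-based step; the blend weight alpha is an assumption.
    """
    smoothed = []
    for node, own in enumerate(logits):
        nbrs = neighbors.get(node, [])
        if not nbrs:
            smoothed.append(list(own))
            continue
        mean = [sum(logits[n][c] for n in nbrs) / len(nbrs) for c in range(len(own))]
        smoothed.append([(1 - alpha) * o + alpha * m for o, m in zip(own, mean)])
    return smoothed

def softmax(row):
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def fit_class_bias(prior, labeled, steps=200, lr=0.5):
    """Few-shot calibration sketch: one additive bias per class,
    fitted by gradient descent on cross-entropy over the labeled nodes."""
    num_classes = len(prior[0])
    bias = [0.0] * num_classes
    for _ in range(steps):
        grad = [0.0] * num_classes
        for node, label in labeled.items():
            probs = softmax([p + b for p, b in zip(prior[node], bias)])
            for c in range(num_classes):
                grad[c] += probs[c] - (1.0 if c == label else 0.0)
        bias = [b - lr * g / len(labeled) for b, g in zip(bias, grad)]
    return bias

def predict(prior, bias=None):
    """Pick the class with the highest (optionally calibrated) logit per node."""
    bias = bias or [0.0] * len(prior[0])
    return [max(range(len(row)), key=lambda c: row[c] + bias[c]) for row in prior]

# Hypothetical prior logits for 4 nodes and 2 classes (e.g. from a text prompt).
prior_logits = [[2.0, 0.0], [0.2, 0.0], [0.0, 1.5], [0.3, 0.0]]
neighbors = {0: [1], 1: [0], 2: [3], 3: [2]}

smoothed = smooth_logits(prior_logits, neighbors)
print(predict(prior_logits))   # zero-shot, text only
print(predict(smoothed))       # zero-shot, after neighbor smoothing

few_shot_bias = fit_class_bias(smoothed, labeled={0: 0, 2: 1})
print(predict(smoothed, few_shot_bias))  # few-shot calibrated
```

In this toy graph, node 3 has a weak text-only prior toward class 0, and smoothing with its neighbor (node 2) moves it to class 1, which is the kind of effect a graph-aware prior is meant to capture.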
Demystifying Assumptions in Learning to Discover Novel Classes
In learning to discover novel classes (L2DNC), we are given labeled data from
seen classes and unlabeled data from unseen classes, and we train clustering
models for the unseen classes. However, L2DNC has not been rigorously defined,
so its implicit assumptions remain unclear.
In this paper, we demystify assumptions behind L2DNC and find that high-level
semantic features should be shared among the seen and unseen classes. This
naturally motivates us to link L2DNC to meta-learning, which rests on exactly the
same assumption. Based on this finding, L2DNC is not only theoretically
solvable, but can also be empirically solved by meta-learning algorithms after
slight modifications. This L2DNC methodology significantly reduces the amount
of unlabeled data needed for training and makes it more practical, as
demonstrated in experiments. The use of very limited data is also justified by
the application scenario of L2DNC: since it is unnatural to label only
seen-class data, L2DNC is, causally, a matter of sampling rather than labeling.
Therefore, unseen-class data should be collected in the course of collecting
seen-class data, which is why they are novel and first need to be clustered.
Embedding-Based Zero-Shot Learning Methods in Computer Vision Tasks
Master's thesis: 156 p., 23 tab., 37 fig., 24 ref., 1 appendix.
The object of the research is the task of classifying images with associated
semantic features.
The subject of the research is machine learning models for zero-shot
classification based on embeddings and evaluation of their effectiveness.
The purpose of the master's thesis is to investigate existing approaches to the
development of zero-shot models, and to determine the most effective architectures
for modeling.
The thesis examines the problem of images from new classes, not included in the
training phase, appearing when a classifier is used in practice.
Modern embedding-based zero-shot learning methods for image classification
are reviewed. A software product has been developed for classifying
images with semantic features using zero-shot learning algorithms.
Scientific results and their novelty. The performance of six different modern
approaches to zero-shot learning in conditions close to practical ones was studied.
Experiments were conducted showing the sensitivity of the models to varying
degrees of noise in attributes and labels, as well as sensitivity to a decrease in the
number of training classes and an increase in the number of test classes.
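As a rough illustration of embedding-based zero-shot classification in general (not any of the six approaches evaluated in the thesis), the sketch below assigns an image embedding to the unseen class whose attribute vector is most similar under cosine similarity. The class names, attribute meanings, and embedding values are hypothetical, and the image embedding is assumed to already live in the attribute space.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def zero_shot_classify(embedding, class_attributes):
    """Return the class whose attribute vector is most similar to the embedding."""
    return max(class_attributes, key=lambda name: cosine(embedding, class_attributes[name]))

# Hypothetical attributes: [has_stripes, has_hooves, lives_in_water].
# Neither class needs training images; each is described only by its attributes.
unseen_classes = {
    "zebra": [1.0, 1.0, 0.0],
    "dolphin": [0.0, 0.0, 1.0],
}

# Hypothetical image embedding already projected into the attribute space.
image_embedding = [0.9, 0.8, 0.1]

print(zero_shot_classify(image_embedding, unseen_classes))  # zebra
```

Perturbing the entries of `unseen_classes` before classification is one simple way to probe the kind of attribute-noise sensitivity the thesis reports on.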
Attribute propagation network for graph zero-shot learning
The goal of zero-shot learning (ZSL) is to train a model to classify samples of classes that were not seen during training. To address this challenging task, most ZSL methods relate unseen test classes to seen (training) classes via a predefined set of attributes that can describe all classes in the same semantic space, so that the knowledge learned on the training classes can be adapted to unseen classes. In this paper, we aim to optimize the attribute space for ZSL by training a propagation mechanism to refine the semantic attributes of each class based on its neighbors and related classes on a graph of classes. We show that the propagated attributes can produce classifiers for zero-shot classes with significantly improved performance in different ZSL settings. The graph of classes is usually free or very cheap to acquire, for example from WordNet or the ImageNet classes. When the graph is not provided, given predefined semantic embeddings of the classes, we can learn a mechanism to generate the graph in an end-to-end manner along with the propagation mechanism. However, this graph-aided technique has not been well explored in the literature. We introduce the “attribute propagation network (APNet)”, which is composed of 1) a graph propagation model that generates an attribute vector for each class and 2) a parameterized nearest neighbor (NN) classifier that assigns an image to the class whose attribute vector is nearest to the image’s embedding. For better generalization over unseen classes, and unlike previous methods, we adopt a meta-learning strategy to train the propagation mechanism and the similarity metric for the NN classifier on multiple sub-graphs, each associated with a classification task over a subset of training classes. In experiments with two zero-shot learning settings and five benchmark datasets, APNet achieves either compelling performance or new state-of-the-art results.
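The two components named in this abstract, attribute propagation over a class graph followed by a nearest-neighbor classifier, can be sketched in a few lines. This is a simplified illustration and not APNet itself: the propagation here is a single fixed-weight neighbor average rather than a learned mechanism, the distance is plain squared Euclidean rather than a parameterized metric, and there is no meta-learning. Class names, attribute meanings, and values are hypothetical.

```python
def propagate_attributes(attributes, neighbors, weight=0.3):
    """One propagation step: blend each class's attribute vector with the
    mean attribute vector of its neighbors on the class graph."""
    propagated = {}
    for name, own in attributes.items():
        nbrs = neighbors.get(name, [])
        if not nbrs:
            propagated[name] = list(own)  # isolated classes keep their attributes
            continue
        mean = [sum(attributes[n][i] for n in nbrs) / len(nbrs) for i in range(len(own))]
        propagated[name] = [(1 - weight) * o + weight * m for o, m in zip(own, mean)]
    return propagated

def nearest_class(embedding, attributes):
    """Assign the embedding to the class with the nearest attribute vector
    (squared Euclidean distance)."""
    def sq_dist(vec):
        return sum((a - b) ** 2 for a, b in zip(embedding, vec))
    return min(attributes, key=lambda name: sq_dist(attributes[name]))

# Hypothetical attributes: [furry, striped, has_wheels].
class_attributes = {
    "cat": [1.0, 0.0, 0.0],
    "tiger": [1.0, 1.0, 0.0],
    "truck": [0.0, 0.0, 1.0],
}
# Hypothetical class graph (e.g. derived from a taxonomy): cat and tiger are related.
class_graph = {"cat": ["tiger"], "tiger": ["cat"], "truck": []}

refined = propagate_attributes(class_attributes, class_graph)
print(refined["cat"])    # pulled toward tiger on the "striped" attribute
print(nearest_class([0.9, 0.8, 0.0], refined))  # tiger
```

In the paper's setting the propagation weights and the similarity metric are trained, so the fixed `weight` above should be read only as a placeholder for that learned component.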