24,831 research outputs found
Hyperbolic Interaction Model For Hierarchical Multi-Label Classification
Unlike traditional classification tasks, which assume that labels are mutually
exclusive, hierarchical multi-label classification (HMLC) assigns multiple
labels to each instance, with the labels organized under hierarchical
relations. Beyond the labels themselves, the conceptual relations between
words also form hierarchical structures, since linguistic ontologies are
intrinsically hierarchical. Learning mappings from word hierarchies to label
hierarchies is therefore challenging. We propose to model the word and label
hierarchies by embedding them jointly in the hyperbolic space. The main reason
is that the tree-likeness of the hyperbolic space matches the complexity of
symbolic data with hierarchical structures. A new Hyperbolic Interaction Model
(HyperIM) is designed to learn the label-aware document representations and
make predictions for HMLC. Extensive experiments are conducted on three
benchmark datasets. The results demonstrate that the new model realistically
captures complex data structures and further improves performance on HMLC
compared with state-of-the-art methods. To facilitate future research, our
code is publicly available.
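As a hedged illustration of the abstract's central idea, the following minimal numpy sketch computes the Poincaré-ball distance that hyperbolic embedding models such as HyperIM typically build on. The point coordinates are toy values chosen for illustration, not embeddings from the paper:

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the Poincare ball (||x|| < 1)."""
    sq_diff = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq_diff / (denom + eps))

# Points near the boundary of the ball are exponentially far from the origin,
# which is why tree-like hierarchies embed well: the root sits near the
# center, leaves near the boundary.
root = np.array([0.0, 0.0])
child = np.array([0.5, 0.0])
leaf = np.array([0.9, 0.0])
```

The distance from `root` to `leaf` grows much faster than the Euclidean gap suggests, mirroring the exponential growth of nodes with tree depth.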
Multi-Target Prediction: A Unifying View on Problems and Methods
Multi-target prediction (MTP) is concerned with the simultaneous prediction
of multiple target variables of diverse type. Due to its enormous application
potential, it has developed into an active and rapidly expanding research field
that combines several subfields of machine learning, including multivariate
regression, multi-label classification, multi-task learning, dyadic prediction,
zero-shot learning, network inference, and matrix completion. In this paper, we
present a unifying view on MTP problems and methods. First, we formally discuss
commonalities and differences between existing MTP problems. To this end, we
introduce a general framework that covers the above subfields as special cases.
As a second contribution, we provide a structured overview of MTP methods. This
is accomplished by identifying a number of key properties, which distinguish
such methods and determine their suitability for different types of problems.
Finally, we also discuss a few challenges for future research.
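One hedged way to picture the unifying view (this sketch is ours, not the paper's formal framework) is that many MTP problems reduce to learning a score function over (instance, target) pairs: a bilinear model covers multivariate regression directly, and thresholding its scores recovers multi-label classification:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))   # instance feature vectors
T = rng.normal(size=(3, 4))   # target ("side") feature vectors

def score(x, t, W):
    # Bilinear score f(x, t) = x^T W t: real-valued output suits
    # regression-style targets; a sign threshold yields label decisions.
    return x @ W @ t

W = rng.normal(size=(4, 4))
Y_regression = np.array([[score(x, t, W) for t in T] for x in X])
Y_multilabel = Y_regression > 0   # binarized scores -> multi-label predictions
```

Subfields then differ mainly in which (instance, target) pairs are observed at training time, e.g. matrix completion observes a sparse subset, while zero-shot learning holds out entire target columns.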
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding
Scientific literature understanding tasks have gained significant attention
due to their potential to accelerate scientific discovery. Pre-trained language
models (LMs) have shown effectiveness in these tasks, especially when tuned via
contrastive learning. However, jointly utilizing pre-training data across
multiple heterogeneous tasks (e.g., extreme multi-label paper classification,
citation prediction, and literature search) remains largely unexplored. To
bridge this gap, we propose a multi-task contrastive learning framework,
SciMult, with a focus on facilitating common knowledge sharing across different
scientific literature understanding tasks while preventing task-specific skills
from interfering with each other. To be specific, we explore two techniques --
task-aware specialization and instruction tuning. The former adopts a
Mixture-of-Experts Transformer architecture with task-aware sub-layers; the
latter prepends task-specific instructions to the input text so as to produce
task-aware outputs. Extensive experiments on a comprehensive collection of
benchmark datasets verify the effectiveness of our task-aware specialization
strategy, where we outperform state-of-the-art scientific pre-trained LMs.
Code, datasets, and pre-trained models can be found at
https://scimult.github.io/.
Comment: 17 pages; Accepted to Findings of EMNLP 2023 (Project Page:
https://scimult.github.io/)
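The instruction-tuning technique described above amounts to prepending a task-specific instruction string to each input before encoding, so a shared encoder produces task-aware outputs. A minimal sketch, where the instruction strings are illustrative assumptions, not the ones SciMult actually uses:

```python
# Hypothetical task-to-instruction mapping; real instructions would be
# chosen per task during pre-training.
INSTRUCTIONS = {
    "classification": "Classify the fields of the following paper. ",
    "citation": "Find papers cited by the following paper. ",
    "search": "Retrieve papers relevant to the following query. ",
}

def build_input(task: str, text: str) -> str:
    """Prepend the task's instruction so the encoder sees which task it serves."""
    return INSTRUCTIONS[task] + text

example = build_input("search", "graph neural networks for molecules")
```

At inference, the same mapping routes a query through the instruction matching its task, while the encoder weights stay shared across all tasks.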
The Emerging Trends of Multi-Label Learning
Exabytes of data are generated daily, creating a growing need for new efforts
to address the grand challenges that big data poses for multi-label learning.
For example, extreme multi-label classification is an active and rapidly
growing research area that deals with classification tasks involving an
extremely large number of classes or labels, and building multi-label
classification models from massive data with limited supervision has become
valuable for practical applications. Beyond these, there are tremendous
efforts to harness the strong learning capability of deep learning to better
capture label dependencies in multi-label learning, which is key for deep
learning to address real-world classification tasks. However, there has been a
lack of systematic studies that focus explicitly on analyzing the emerging
trends and new challenges of multi-label learning in the era of big data. A
comprehensive survey is therefore needed to fulfill this mission and delineate
future research directions and new applications.
Comment: Accepted to TPAMI 202
GUDN: A novel guide network with label reinforcement strategy for extreme multi-label text classification
In natural language processing, extreme multi-label text classification
(XMTC) is an emerging but essential task: recalling the most relevant labels
for a text from an extremely large label set. Large-scale pre-trained models
have brought a new trend to this problem, but although they have achieved
significant results, effective fine-tuning methods have yet to be fully
studied. Likewise, although label semantics have been introduced in XMTC, the
vast semantic gap between texts and labels has yet to gain enough attention.
This paper builds a new guide network (GUDN) to help fine-tune the pre-trained
model and instruct classification. Furthermore, GUDN uses raw
label semantics combined with a helpful label reinforcement strategy to
effectively explore the latent space between texts and labels, narrowing the
semantic gap and further improving prediction accuracy. Experimental results
demonstrate that GUDN outperforms state-of-the-art methods on Eurlex-4k and
achieves competitive results on other popular datasets. In an additional
experiment, we investigate the influence of input length on the
Transformer-based model's accuracy. Our source code is released at
https://t.hk.uy/aFSH.
Comment: 12 pages, 6 figures
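The XMTC prediction step the abstract describes, recalling a handful of relevant labels from a huge label set, can be sketched in a hedged, simplified form (this is the generic retrieval step, not GUDN's specific architecture): score a text embedding against every label embedding in a shared latent space and keep the top k.

```python
import numpy as np

def top_k_labels(text_emb, label_embs, k=5):
    """Return indices of the k labels whose embeddings score highest
    against the text embedding (dot-product similarity)."""
    scores = label_embs @ text_emb
    return np.argsort(-scores)[:k]

# Toy example: 10 labels embedded as one-hot vectors; the text embedding
# is aligned with label 3, so label 3 should be recalled first.
label_embs = np.eye(10)
text_emb = np.zeros(10)
text_emb[3] = 1.0
predicted = top_k_labels(text_emb, label_embs, k=3)
```

Real XMTC systems replace the exhaustive dot product with approximate nearest-neighbor search, since the label set can contain millions of entries.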