88,978 research outputs found
Large-scale Multi-label Text Classification - Revisiting Neural Networks
Neural networks have recently been proposed for multi-label classification
because they are able to capture and model label dependencies in the output
layer. In this work, we investigate the limitations of BP-MLL, a neural network
(NN) architecture that aims at minimizing pairwise ranking error. Instead, we
propose to use a comparably simple NN approach with recently proposed learning
techniques for large-scale multi-label text classification tasks. In
particular, we show that BP-MLL's ranking loss minimization can be efficiently
and effectively replaced with the commonly used cross entropy error function,
and demonstrate that several advances in neural network training that have been
developed in the realm of deep learning can be effectively employed in this
setting. Our experimental results show that simple NN models equipped with
advanced techniques such as rectified linear units, dropout, and AdaGrad
perform as well as or even outperform state-of-the-art approaches on six
large-scale textual datasets with diverse characteristics.
Comment: 16 pages, 4 figures, submitted to ECML 201
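As a quick illustration of the recipe this abstract describes, here is a minimal PyTorch sketch: a single-hidden-layer network with rectified linear units and dropout, trained with per-label cross entropy (BCE-with-logits standing in for the paper's cross entropy error function) and AdaGrad. All dimensions are hypothetical, not taken from the paper.

```python
import torch
import torch.nn as nn

# Simple multi-label text classifier: ReLU + dropout + per-label cross entropy,
# optimized with AdaGrad. Sizes below are illustrative placeholders.
model = nn.Sequential(
    nn.Linear(10000, 512),   # BoW/tf-idf document features -> hidden layer
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(512, 100),     # one logit per label
)
criterion = nn.BCEWithLogitsLoss()          # cross entropy over independent labels
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01)

x = torch.randn(32, 10000)                  # dummy batch of document vectors
y = (torch.rand(32, 100) < 0.05).float()    # dummy multi-hot label matrix

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
```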
Learning Deep Latent Spaces for Multi-Label Classification
Multi-label classification is a practical yet challenging task in machine
learning related fields, since it requires the prediction of more than one
label category for each input instance. We propose a novel deep neural
network (DNN)-based model, Canonical Correlated AutoEncoder (C2AE), for solving this
task. Aiming at better relating feature and label domain data for improved
classification, we uniquely perform joint feature and label embedding by
deriving a deep latent space, followed by a label-correlation sensitive loss
function for recovering the predicted label outputs. Our C2AE is
achieved by integrating the DNN architectures of canonical correlation analysis
and autoencoder, which allows end-to-end learning and prediction with the
ability to exploit label dependency. Moreover, our C2AE can be easily extended
to address the learning problem with missing labels. Our experiments on
multiple datasets with different scales confirm the effectiveness and
robustness of our proposed method, which is shown to perform favorably against
state-of-the-art methods for multi-label classification.
Comment: published in AAAI-201
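A rough sketch of the C2AE idea follows: a feature encoder and a label encoder map into a shared latent space, and a decoder recovers the label vector. Here the CCA constraint is approximated by a plain L2 alignment penalty and the label-correlation sensitive loss is simplified to BCE; both are stand-ins, not the paper's exact objectives.

```python
import torch
import torch.nn as nn

# C2AE-style joint feature/label embedding (simplified stand-in).
d_feat, d_lab, d_lat = 500, 20, 64
Fx = nn.Sequential(nn.Linear(d_feat, d_lat), nn.ReLU(), nn.Linear(d_lat, d_lat))  # feature encoder
Fe = nn.Sequential(nn.Linear(d_lab, d_lat), nn.ReLU(), nn.Linear(d_lat, d_lat))   # label encoder
Fd = nn.Sequential(nn.Linear(d_lat, d_lat), nn.ReLU(), nn.Linear(d_lat, d_lab))   # label decoder

x = torch.randn(16, d_feat)
y = (torch.rand(16, d_lab) < 0.2).float()

zx, zy = Fx(x), Fe(y)
align = ((zx - zy) ** 2).mean()                       # latent-space alignment (CCA surrogate)
recon = nn.functional.binary_cross_entropy_with_logits(Fd(zy), y)
loss = recon + 1.0 * align                            # trade-off weight is illustrative

# At test time labels are unknown, so prediction decodes the feature embedding:
y_hat = torch.sigmoid(Fd(Fx(x)))
```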
Neuro-symbolic Rule Learning in Real-world Classification Tasks
Neuro-symbolic rule learning has attracted considerable attention because it offers
better interpretability than pure neural models and scales better than symbolic
rule learning. A recent approach named pix2rule proposes a neural Disjunctive
Normal Form (neural DNF) module to learn symbolic rules with feed-forward
layers. Although proven effective on synthetic binary classification, pix2rule
has not been applied to more challenging tasks such as multi-label and
multi-class classification over real-world data. In this paper, we address
this limitation by extending the neural DNF module to (i) support rule learning
in real-world multi-class and multi-label classification tasks, (ii) enforce
the symbolic property of mutual exclusivity (i.e. predicting exactly one class)
in multi-class classification, and (iii) explore its scalability over large
inputs and outputs. We train a vanilla neural DNF model similar to pix2rule's
neural DNF module for multi-label classification, and we propose a novel
extended model called neural DNF-EO (Exactly One) which enforces mutual
exclusivity in multi-class classification. We evaluate the classification
performance, scalability and interpretability of our neural DNF-based models,
and compare them against pure neural models and a state-of-the-art symbolic
rule learner named FastLAS. We demonstrate that our neural DNF-based models
perform similarly to neural networks, but provide better interpretability by
enabling the extraction of logical rules. Our models also scale well when the
rule search space grows in size, in contrast to FastLAS, which fails to learn
in multi-class classification tasks with 200 classes and in all multi-label
settings.
Comment: Accepted at AAAI-MAKE 202
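To make the neural DNF idea concrete, here is a generic differentiable-DNF layer sketch: a layer of soft conjunctions over inputs in [0, 1] followed by a layer of soft disjunctions. This is not pix2rule's exact semi-symbolic formulation (literal negation is omitted for brevity), and neural DNF-EO's mutual-exclusivity constraint is only approximated here by a final softmax.

```python
import torch
import torch.nn as nn

class SoftDNF(nn.Module):
    """Soft conjunctions over [0, 1] inputs, then soft disjunctions (generic sketch)."""
    def __init__(self, n_in, n_conj, n_out):
        super().__init__()
        self.conj_sel = nn.Parameter(torch.randn(n_conj, n_in))   # which inputs each rule uses
        self.disj_sel = nn.Parameter(torch.randn(n_out, n_conj))  # which rules each class uses

    def forward(self, x):                       # x: (batch, n_in), values in [0, 1]
        m = torch.sigmoid(self.conj_sel)
        # soft AND: product over selected inputs (unselected ones contribute ~1)
        conj = torch.exp((m * x.clamp_min(1e-6).log().unsqueeze(1)).sum(-1))
        s = torch.sigmoid(self.disj_sel)
        # soft OR: 1 - product over (1 - selected conjunction values)
        disj = 1 - torch.exp((1 - s * conj.unsqueeze(1)).clamp_min(1e-6).log().sum(-1))
        return disj                             # per-class scores; softmax for "exactly one"

dnf = SoftDNF(n_in=10, n_conj=6, n_out=3)
scores = dnf(torch.rand(4, 10))                 # (4, 3) scores in [0, 1)
```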
Towards Improved Imbalance Robustness in Continual Multi-Label Learning with Dual Output Spiking Architecture (DOSA)
Algorithms designed for addressing typical supervised classification problems
can only learn from a fixed set of samples and labels, making them unsuitable
for the real world, where data arrives as a stream of samples often associated
with multiple labels over time. This motivates the study of task-agnostic
continual multi-label learning problems. While algorithms using deep learning
approaches for continual multi-label learning have been proposed in the recent
literature, they tend to be computationally heavy. Although spiking neural
networks (SNNs) offer a computationally efficient alternative to artificial
neural networks, existing literature has not used SNNs for continual
multi-label learning. Also, accurately determining multiple labels with SNNs is
still an open research problem. This work proposes a dual output spiking
architecture (DOSA) to bridge these research gaps. A novel imbalance-aware loss
function is also proposed, improving the multi-label classification performance
of the model by making it more robust to data imbalance. A modified F1 score is
presented to evaluate the effectiveness of the proposed loss function in
handling imbalance. Experiments on several benchmark multi-label datasets show
that DOSA trained with the proposed loss function shows improved robustness to
data imbalance and obtains better continual multi-label learning performance
than CIFDM, a previous state-of-the-art algorithm.
Comment: 8 pages, 4 figures, 4 tables, 45 references. Submitted to IJCNN 202
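The abstract does not spell out the imbalance-aware loss, so the following is only a plausible stand-in: per-label BCE re-weighted by inverse label frequency, so that rare labels contribute more to the gradient. DOSA's actual loss (and its spiking dynamics) may differ.

```python
import torch

def imbalance_aware_bce(logits, targets, eps=1e-6):
    """Per-label BCE re-weighted by inverse label frequency
    (a generic stand-in, not necessarily DOSA's exact loss)."""
    freq = targets.mean(dim=0).clamp_min(eps)     # empirical positive rate per label
    pos_weight = (1 - freq) / freq                # up-weight rare positive labels
    return torch.nn.functional.binary_cross_entropy_with_logits(
        logits, targets, pos_weight=pos_weight
    )

loss = imbalance_aware_bce(torch.randn(32, 10), (torch.rand(32, 10) < 0.1).float())
```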
Multi-label Node Classification On Graph-Structured Data
Graph Neural Networks (GNNs) have shown state-of-the-art improvements in node
classification tasks on graphs. While these improvements have been largely
demonstrated in a multi-class classification scenario, a more general and
realistic scenario in which each node could have multiple labels has so far
received little attention. The first challenge in conducting focused studies on
multi-label node classification is the limited number of publicly available
multi-label graph datasets. Therefore, as our first contribution, we collect
and release three real-world biological datasets and develop a multi-label
graph generator to generate datasets with tunable properties. While the success
of GNNs is usually attributed to high label similarity (high homophily), we
argue that a multi-label scenario does not follow the usual semantics of
homophily and heterophily so far defined for a multi-class scenario. As our
second contribution, besides defining homophily for the multi-label scenario,
we develop a new approach that dynamically fuses the feature and label
correlation information to learn label-informed representations. Finally, we
perform a large-scale comparative study across a range of methods and datasets,
which also showcases the effectiveness of our approach. We release our benchmark
at \url{https://anonymous.4open.science/r/LFLF-5D8C/}
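One natural way to extend homophily to the multi-label setting, shown here purely as an illustration (the paper's exact definition may differ), is to score each edge by the Jaccard similarity of its endpoints' label sets and average over all edges:

```python
import numpy as np

def multilabel_homophily(edges, Y):
    """edges: (E, 2) int array of node pairs; Y: (n_nodes, n_labels) binary matrix."""
    u, v = edges[:, 0], edges[:, 1]
    inter = (Y[u] & Y[v]).sum(axis=1)                     # shared labels per edge
    union = (Y[u] | Y[v]).sum(axis=1)                     # combined labels per edge
    jac = np.where(union > 0, inter / np.maximum(union, 1), 0.0)
    return jac.mean()

Y = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1]])
edges = np.array([[0, 1], [1, 2]])
print(multilabel_homophily(edges, Y))   # each edge scores 1/3 -> 0.333...
```

Under this definition, a multi-class graph with one label per node recovers the usual edge-homophily ratio, which is why it is a reasonable illustrative generalization.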
Bag-of-Words vs. Sequence vs. Graph vs. Hierarchy for Single- and Multi-Label Text Classification
Graph neural networks have triggered a resurgence of graph-based text
classification methods, defining today's state of the art. We show that a
simple multi-layer perceptron (MLP) using a Bag of Words (BoW) outperforms the
recent graph-based models TextGCN and HeteGCN in an inductive text
classification setting and is comparable with HyperGAT in single-label
classification. We also run our own experiments on multi-label classification,
where the simple MLP outperforms the recent sequence-based gMLP and aMLP
models. Moreover, we fine-tune a sequence-based BERT and a lightweight
DistilBERT model, both of which outperform all other models in both single-label
and multi-label settings on most datasets. These results question the importance of
synthetic graphs used in modern text classifiers. In terms of parameters,
DistilBERT is still twice as large as our BoW-based wide MLP, while graph-based
models like TextGCN require setting up an O(N^2) graph, where N
is the vocabulary plus corpus size.
Comment: arXiv admin note: substantial text overlap with arXiv:2109.0377
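The wide BoW-based MLP baseline can be sketched in a few lines; the sizes below are placeholders, not the paper's settings:

```python
import torch.nn as nn

# Wide MLP over Bag-of-Words features (illustrative sizes).
vocab_size, hidden, n_labels = 30000, 1024, 50
mlp = nn.Sequential(
    nn.Linear(vocab_size, hidden),   # one wide hidden layer over BoW counts/tf-idf
    nn.ReLU(),
    nn.Dropout(0.5),
    nn.Linear(hidden, n_labels),     # pair with BCEWithLogitsLoss (multi-label)
)                                    # or CrossEntropyLoss (single-label)
```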
Transferring CNNs to Multi-instance Multi-label Classification on Small Datasets
Image tagging is a well-known challenge in image processing. It is typically addressed through multi-instance multi-label (MIML) classification methodologies. Convolutional Neural Networks (CNNs) have great potential to perform well on MIML tasks, since multi-level convolution and max pooling coincide with the multi-instance setting, and the sharing of hidden representations may benefit multi-label modeling. However, CNNs usually require a large amount of carefully labeled data for training, which is hard to obtain in many real applications. In this paper, we propose a new approach for transferring pre-trained deep networks, such as VGG16 trained on ImageNet, to small MIML tasks. We extract features from each group of the network's layers and apply multiple binary classifiers to them for multi-label prediction. Moreover, we adopt L1-norm regularized Logistic Regression (L1LR) to find the most effective features for learning the multi-label classifiers. Experimental results on two of the most widely used and relatively small benchmark MIML image datasets demonstrate that the proposed approach substantially outperforms state-of-the-art algorithms in terms of all popular performance metrics.
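A hedged sketch of that transfer recipe, assuming torchvision's VGG16 and scikit-learn's logistic regression; the layer cut, pooling, and hyperparameters below are illustrative choices, not the paper's:

```python
import numpy as np
import torch
from torchvision import models
from sklearn.linear_model import LogisticRegression

# Extract activations from an early group of pre-trained VGG16 layers,
# then fit one L1-regularized logistic regression per label.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
feature_extractor = vgg.features[:17]            # cut after the third conv block (illustrative)

with torch.no_grad():
    imgs = torch.randn(8, 3, 224, 224)           # stand-in for preprocessed images
    feats = feature_extractor(imgs).mean(dim=(2, 3)).numpy()  # global-average-pool each map

labels = np.tile([[1, 0, 1], [0, 1, 0]], (4, 1)) # dummy multi-hot labels (3 labels)
clfs = [
    LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(feats, labels[:, k])
    for k in range(labels.shape[1])              # one binary L1LR classifier per label
]
```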