Justifying Information-Geometric Causal Inference
Information Geometric Causal Inference (IGCI) is a new approach to
distinguish between cause and effect for two variables. It is based on an
independence assumption between input distribution and causal mechanism that
can be phrased in terms of orthogonality in information space. We describe two
intuitive reinterpretations of this approach that make IGCI more accessible to
a broader audience.
Moreover, we show that the described independence is related to the
hypothesis that unsupervised learning and semi-supervised learning only work
for predicting the cause from the effect and not vice versa. Comment: 3 Figures
Universum Prescription: Regularization using Unlabeled Data
This paper shows that simply prescribing "none of the above" labels to
unlabeled data has a beneficial regularization effect on supervised learning.
We call it universum prescription by the fact that the prescribed labels cannot
be one of the supervised labels. In spite of its simplicity, universum
prescription obtained competitive results in training deep convolutional
networks for CIFAR-10, CIFAR-100, STL-10 and ImageNet datasets. A qualitative
justification of these approaches using Rademacher complexity is presented. The
effect of a regularization parameter -- probability of sampling from unlabeled
data -- is also studied empirically. Comment: 7 pages for article, 3 pages for supplemental material. To appear in
AAAI-1
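The prescription idea in the abstract above can be illustrated with a minimal sketch. It assumes one particular reading, in which "none of the above" is represented by a single extra class index; the function name `universum_batch`, its parameters, and the data layout are hypothetical and are not taken from the paper. The parameter `p` plays the role of the probability of sampling from unlabeled data.

```python
import random


def universum_batch(labeled, unlabeled, num_classes, p, batch_size, rng):
    """Build one training batch of (sample, label) pairs.

    labeled     -- list of (sample, label) pairs with labels in 0..num_classes-1
    unlabeled   -- list of samples without labels
    num_classes -- number of supervised classes; the extra index `num_classes`
                   is used here as the "none of the above" label (an assumption)
    p           -- probability of drawing a batch slot from the unlabeled pool
    rng         -- a random.Random instance, passed in for reproducibility
    """
    batch = []
    for _ in range(batch_size):
        if rng.random() < p:
            # Unlabeled sample: prescribe the extra "none of the above" label.
            batch.append((rng.choice(unlabeled), num_classes))
        else:
            # Labeled sample: keep its supervised label unchanged.
            batch.append(rng.choice(labeled))
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    labeled = [("img_a", 0), ("img_b", 1), ("img_c", 2)]
    unlabeled = ["img_u1", "img_u2"]
    for sample, label in universum_batch(labeled, unlabeled, 3, 0.3, 8, rng):
        print(sample, label)
```

A classifier trained on such batches would need `num_classes + 1` outputs under this reading. Setting `p` to 0 recovers plain supervised sampling.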
Applicability of semi-supervised learning assumptions for gene ontology terms prediction
Gene Ontology (GO) is one of the most important resources in bioinformatics, aiming to provide a unified framework for the biological annotation of genes and proteins across all species. Predicting GO terms is an essential task for bioinformatics, but the number of available labelled proteins is in several cases insufficient for training reliable machine learning classifiers. Semi-supervised learning methods arise as a powerful solution that exploits the information contained in unlabelled data in order to improve the estimations of traditional supervised approaches. However, semi-supervised learning methods have to make strong assumptions about the nature of the training data, and thus the performance of the predictor is highly dependent on these assumptions. This paper presents an analysis of the applicability of semi-supervised learning assumptions to the specific task of GO term prediction, focused on providing criteria for choosing the most suitable tools for specific GO terms. The results show that semi-supervised approaches significantly outperform the traditional supervised methods and that the highest performance is reached when applying the cluster assumption. In addition, it is experimentally demonstrated that the cluster and manifold assumptions are complementary to each other, and an analysis is provided of which GO terms are more likely to be correctly predicted under each assumption. Postprint (published version)
Exploiting Universum data in AdaBoost using gradient descent
Recently, Universum data, which does not belong to any class of the training data, has been applied to train better classifiers. In this paper, we present a novel boosting algorithm called UAdaBoost that can improve the classification performance of AdaBoost with Universum data. UAdaBoost chooses a function by minimizing the loss for labeled data and Universum data. The cost function is minimized by a greedy, stagewise, functional gradient procedure. Each training stage of UAdaBoost is fast and efficient. The standard AdaBoost weights labeled samples during training iterations, while UAdaBoost gives an explicit weighting scheme for Universum samples as well. In addition, this paper describes the practical conditions for the effectiveness of Universum learning. These conditions are based on the analysis of the distribution of ensemble predictions over training samples. Experiments on handwritten digit classification and gender classification problems are presented. As our experimental results show, the proposed method can outperform the standard AdaBoost when proper Universum data are selected. © 2014 Elsevier B.V.
All Beings Are Equal in Open Set Recognition
In open-set recognition (OSR), a promising strategy is exploiting
pseudo-unknown data outside the given $K$ known classes as an additional
$(K+1)$-th class to explicitly model potential open space. However, treating
unknown classes without distinction is unequal for them relative to known
classes due to the category-agnostic and scale-agnostic nature of the unknowns.
This inevitably not only disrupts the inherent distributions of unknown classes
but also incurs
both class-wise and instance-wise imbalances between known and unknown classes.
Ideally, the OSR problem should model the whole class space as $K+\infty$,
but enumerating all unknowns is impractical. Since the core of OSR is to
effectively model the boundaries of known classes, focusing only on
the unknowns near the boundaries of targeted known classes seems sufficient.
Thus, as a compromise, we convert the open classes from infinite to $K$, with a
novel concept, Target-Aware Universum (TAU), and propose a simple yet effective
framework, Dual Contrastive Learning with Target-Aware Universum (DCTAU). In
detail, guided by the targeted known classes, TAU automatically expands the
unknown classes from the previous $1$ to $K$, effectively alleviating the
distribution disruption and the imbalance issues mentioned above. Then, a novel
Dual Contrastive (DC) loss is designed, where all instances irrespective of
known or TAU are considered as positives to contrast with their respective
negatives. Experimental results indicate DCTAU sets a new state-of-the-art. Comment: Accepted by the main track of the 38th Annual AAAI Conference on
Artificial Intelligence (AAAI 2024)
A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning
Current deep learning research is dominated by benchmark evaluation. A method
is regarded as favorable if it empirically performs well on the dedicated test
set. This mentality is seamlessly reflected in the resurfacing area of
continual learning, where consecutively arriving sets of benchmark data are
investigated. The core challenge is framed as protecting previously acquired
representations from being catastrophically forgotten due to the iterative
parameter updates. However, the comparison of individual methods is still
treated in isolation from real world application and typically judged by
monitoring accumulated test set performance. The closed world assumption
remains predominant. It is assumed that during deployment a model is guaranteed
to encounter data that stems from the same distribution as used for training.
This poses a massive challenge as neural networks are well known to provide
overconfident false predictions on unknown instances and break down in the face
of corrupted data. In this work we argue that notable lessons from open set
recognition, the identification of statistically deviating data outside of the
observed dataset, and the adjacent field of active learning, where data is
incrementally queried such that the expected performance gain is maximized, are
frequently overlooked in the deep learning era. Based on these forgotten
lessons, we propose a consolidated view to bridge continual learning, active
learning and open set recognition in deep neural networks. Our results show
that this not only benefits each individual paradigm, but highlights the
natural synergies in a common framework. We empirically demonstrate
improvements when alleviating catastrophic forgetting, querying data in active
learning, selecting task orders, while exhibiting robust open world application
where previously proposed methods fail. Comment: 32 pages