Projection based ensemble learning for ordinal regression
The classification of patterns into naturally ordered
labels is referred to as ordinal regression. This paper proposes
an ensemble methodology specifically adapted to this type of
problem, based on computing different classification
tasks through the formulation of different order hypotheses.
Each individual model is trained to distinguish between
one given class (k) and all the remaining ones, grouping
the latter into those classes with a rank lower than k and those
with a rank higher than k. The approach can therefore be considered
a reformulation of the well-known one-versus-all scheme. The
base algorithm for the ensemble could be any threshold (or
even probabilistic) method, such as the ones selected in this
paper: kernel discriminant analysis, support vector machines
and logistic regression (all reformulated to deal with ordinal
regression problems). The method is seen to be competitive when
compared with other state-of-the-art methodologies (both ordinal
and nominal), by using six measures and a total of fifteen ordinal
datasets. Furthermore, an additional set of experiments is used to
study the potential scalability and interpretability of the proposed
method when using logistic regression as the base methodology for
the ensemble.
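The per-class grouping described above can be sketched as a simple relabelling step, one task per class. This is a minimal illustration with a helper name of our own choosing, not the paper's code:

```python
# Hypothetical sketch of the per-class relabelling behind the ensemble:
# for each class k, the remaining classes are grouped into those with a
# rank lower than k and those with a rank higher than k.

def relabel_for_class(y, k):
    """Map each ordinal label to 0 (rank < k), 1 (rank == k), 2 (rank > k)."""
    return [0 if label < k else 1 if label == k else 2 for label in y]

y = [0, 1, 2, 3, 2, 1]  # ordinal labels for six training examples
tasks = {k: relabel_for_class(y, k) for k in range(4)}
# tasks[k] is the training target for the k-th base model (kernel
# discriminant analysis, SVM, or logistic regression in the paper).
```

Each base model then solves one such task, and the per-class outputs are combined to produce the final ordinal prediction.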
Learning-to-Rank Meets Language: Boosting Language-Driven Ordering Alignment for Ordinal Classification
We present a novel language-driven ordering alignment method for ordinal
classification. The labels in ordinal classification contain additional
ordering relations, making them prone to overfitting when relying solely on
training data. Recent developments in pre-trained vision-language models
inspire us to leverage the rich ordinal priors in human language by converting
the original task into a vision-language alignment task. Consequently, we
propose L2RCLIP, which fully utilizes the language priors from two
perspectives. First, we introduce a complementary prompt tuning technique
called RankFormer, designed to enhance the ordering relation of original rank
prompts. It employs token-level attention with residual-style prompt blending
in the word embedding space. Second, to further incorporate language priors, we
revisit the approximate bound optimization of vanilla cross-entropy loss and
restructure it within the cross-modal embedding space. Consequently, we propose
a cross-modal ordinal pairwise loss to refine the CLIP feature space, where
texts and images maintain both semantic alignment and ordering alignment.
Extensive experiments on three ordinal classification tasks, including facial
age estimation, historical color image (HCI) classification, and aesthetic
assessment, demonstrate its promising performance. The code is available at
https://github.com/raywang335/L2RCLIP.
Comment: Accepted by NeurIPS 202
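The ordering-alignment idea behind the cross-modal ordinal pairwise loss can be sketched as a hinge penalty on image-to-prompt similarities: ranks closer to the true rank should score higher than ranks farther from it. This is a simplified sketch with names of our own choosing, not the L2RCLIP implementation, which operates on CLIP embeddings:

```python
def ordinal_pairwise_loss(sim, true_rank, margin=0.1):
    """Hinge penalty whenever a rank farther from the true rank scores a
    higher similarity than a closer one (sketch, not the paper's loss).

    sim: similarity between one image and each rank's text prototype.
    """
    loss = 0.0
    n = len(sim)
    for i in range(n):
        for j in range(n):
            # rank i is strictly closer to the true rank than rank j
            if abs(i - true_rank) < abs(j - true_rank):
                loss += max(0.0, margin + sim[j] - sim[i])
    return loss

# A well-ordered similarity profile incurs no penalty:
ordinal_pairwise_loss([0.9, 0.5, 0.1], true_rank=0)  # 0.0
```

Minimizing such a penalty pushes the text prototypes and image features toward both semantic alignment and ordering alignment.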
Non-Gaussian Discriminative Factor Models via the Max-Margin Rank-Likelihood
We consider the problem of discriminative factor analysis for data that are
in general non-Gaussian. A Bayesian model based on the ranks of the data is
proposed. We first introduce a new {\em max-margin} version of the
rank-likelihood. A discriminative factor model is then developed, integrating
the max-margin rank-likelihood and (linear) Bayesian support vector machines,
which are also built on the max-margin principle. The discriminative factor
model is further extended to the {\em nonlinear} case through mixtures of local
linear classifiers, via Dirichlet processes. Fully local conjugacy of the model
yields efficient inference with both Markov Chain Monte Carlo and variational
Bayes approaches. Extensive experiments on benchmark and real data demonstrate
superior performance of the proposed model and its potential for applications
in computational biology.
Comment: 14 pages, 7 figures, ICML 201
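The max-margin rank-likelihood can be illustrated by the constraint it places on latent scores: whenever one observation outranks another, its latent score must exceed the other's by at least a fixed margin. A minimal sketch, in which the helper name and unit margin are our assumptions rather than the paper's notation:

```python
def satisfies_max_margin_ranks(z, y, margin=1.0):
    """Check whether latent scores z respect the observed ranks y with a
    margin: y[i] > y[j] must imply z[i] >= z[j] + margin (sketch only)."""
    n = len(y)
    return all(z[i] >= z[j] + margin
               for i in range(n) for j in range(n)
               if y[i] > y[j])
```

In the paper this constraint enters a Bayesian model through the rank-likelihood, and inference integrates over the latent scores rather than checking them point-wise.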
Tree based Progressive Regression Model for Watch-Time Prediction in Short-video Recommendation
An accurate prediction of watch time has been of vital importance to enhance
user engagement in video recommender systems. To achieve this, there are four
properties that a watch time prediction framework should satisfy: first,
despite its continuous value, watch time is also an ordinal variable and the
relative ordering between its values reflects the differences in user
preferences. Therefore the ordinal relations should be reflected in watch time
predictions. Second, the conditional dependence between the video-watching
behaviors should be captured in the model. For instance, one has to watch half
of a video before finishing the whole video. Third, modeling
watch time with a point estimation ignores the fact that models might give
results with high uncertainty and this could cause bad cases in recommender
systems. Therefore the framework should be aware of prediction uncertainty.
Fourth, real-life recommender systems suffer from severe bias amplification,
so an estimation without bias amplification is expected. We therefore propose
TPM for watch time prediction. Specifically, the ordinal ranks of watch time
are introduced into TPM and the problem is decomposed into a series of
conditionally dependent classification tasks organized into a tree
structure. The expectation of watch time can be generated by traversing the
tree and the variance of watch time predictions is explicitly introduced into
the objective function as a measurement for uncertainty. Moreover, we
illustrate that backdoor adjustment can be seamlessly incorporated into TPM,
which alleviates bias amplification. Extensive offline evaluations have been
conducted on public datasets, and TPM has been deployed in the real-world video
app Kuaishou with over 300 million DAUs. The results indicate that TPM
outperforms state-of-the-art approaches and indeed improves video consumption
significantly.
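The expectation-by-traversal step can be sketched on a toy tree in which each internal node holds the conditional probability of descending to its left child. The structure and numbers below are illustrative assumptions, not the TPM implementation:

```python
# Each node is either ('leaf', watch_time) or ('split', p_left, left, right),
# where p_left is the conditional probability of taking the left branch.

def expected_watch_time(node):
    """Traverse the tree, accumulating path probabilities times leaf values."""
    if node[0] == 'leaf':
        return node[1]
    _, p_left, left, right = node
    return (p_left * expected_watch_time(left)
            + (1.0 - p_left) * expected_watch_time(right))

tree = ('split', 0.5,
        ('split', 0.5, ('leaf', 10.0), ('leaf', 30.0)),
        ('split', 0.5, ('leaf', 50.0), ('leaf', 70.0)))
expected_watch_time(tree)  # 0.25 * (10 + 30 + 50 + 70) = 40.0
```

Uncertainty can be handled analogously: the variance of the leaf values under the same path probabilities gives the spread that TPM adds to its objective as an uncertainty measure.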
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
This paper presents a language-powered paradigm for ordinal regression.
Existing methods usually treat each rank as a category and employ a set of
weights to learn these concepts. These methods are prone to overfitting and
usually attain unsatisfactory performance, as the learned concepts are mainly derived
from the training set. Recent large pre-trained vision-language models like
CLIP have shown impressive performance on various visual tasks. In this paper,
we propose to learn the rank concepts from the rich semantic CLIP latent space.
Specifically, we reformulate this task as an image-language matching problem
with a contrastive objective, which regards labels as text and obtains a
language prototype from a text encoder for each rank. Since prompt engineering
for CLIP is extremely time-consuming, we propose OrdinalCLIP, a differentiable
prompting method for adapting CLIP for ordinal regression. OrdinalCLIP consists
of learnable context tokens and learnable rank embeddings; the learnable rank
embeddings are constructed by explicitly modeling numerical continuity,
resulting in well-ordered, compact language prototypes in the CLIP space. Once
learned, we can save only the language prototypes and discard the huge language
model, resulting in zero additional computational overhead compared with the
linear head counterpart. Experimental results show that our paradigm achieves
competitive performance in general ordinal regression tasks, and gains
improvements in few-shot and distribution shift settings for age estimation.
The code is available at https://github.com/xk-huang/OrdinalCLIP.
Comment: Accepted by NeurIPS 2022.
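The numerical-continuity construction of rank embeddings can be sketched as linear interpolation between a small number of base embeddings, so that numerically adjacent ranks receive nearby vectors. Shapes and names below are our assumptions, not the OrdinalCLIP API:

```python
import numpy as np

def interpolated_rank_embeddings(base, num_ranks):
    """Spread num_ranks embeddings along the path through the rows of
    `base` by linear interpolation (sketch of the continuity idea)."""
    num_base, dim = base.shape
    positions = np.linspace(0.0, num_base - 1, num_ranks)
    out = np.empty((num_ranks, dim))
    for r, p in enumerate(positions):
        lo, hi = int(np.floor(p)), int(np.ceil(p))
        w = p - lo
        out[r] = (1.0 - w) * base[lo] + w * base[hi]
    return out

# Two base embeddings expand into three ordered rank embeddings:
interpolated_rank_embeddings(np.array([[0.0, 0.0], [1.0, 1.0]]), 3)
```

Because each rank embedding is a convex combination of its neighbours' bases, the resulting language prototypes vary smoothly with the rank index, which is the property the paper exploits.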