Learning Discriminative Stein Kernel for SPD Matrices and Its Applications
The Stein kernel has recently shown promising performance on classifying images
represented by symmetric positive definite (SPD) matrices. It evaluates the
similarity between two SPD matrices through their eigenvalues. In this paper,
we argue that directly using the original eigenvalues may be problematic
because: i) Eigenvalue estimation becomes biased when the number of samples is
inadequate, which may lead to unreliable kernel evaluation; ii) More
importantly, eigenvalues only reflect the property of an individual SPD matrix.
They are not necessarily optimal for computing the Stein kernel when the goal is to
discriminate different sets of SPD matrices. To address the two issues in one
shot, we propose a discriminative Stein kernel, in which an extra parameter
vector is defined to adjust the eigenvalues of the input SPD matrices. The
optimal parameter values are sought by optimizing a proxy of classification
performance. To show the generality of the proposed method, three different
kernel learning criteria that are commonly used in the literature are employed
respectively as a proxy. A comprehensive experimental study is conducted on a
variety of image classification tasks to compare our proposed discriminative
Stein kernel with the original Stein kernel and other commonly used methods for
evaluating the similarity between SPD matrices. The experimental results
demonstrate that the discriminative Stein kernel can attain greater
discrimination and better align with classification tasks by altering the
eigenvalues. This enables it to achieve higher classification performance than the
original Stein kernel and other commonly used methods.
Comment: 13 pages
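As a point of reference for the abstract above, the following is a minimal sketch of the original (unadjusted) Stein kernel, assuming the standard definition k(X, Y) = exp(-theta * S(X, Y)) with S(X, Y) = log det((X + Y) / 2) - (1/2) log det(XY). The function names and the default `theta` are illustrative assumptions, and the eigenvalue adjustment proposed in the paper is not reproduced here.

```python
import math


def cholesky_logdet(a):
    """Return log(det(a)) for a symmetric positive definite matrix (nested lists)."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    logdet = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(s)
                # det(a) is the squared product of the Cholesky diagonal entries.
                logdet += 2.0 * math.log(lower[i][j])
            else:
                lower[i][j] = s / lower[j][j]
    return logdet


def stein_divergence(x, y):
    """S(X, Y) = log det((X + Y) / 2) - 0.5 * (log det X + log det Y)."""
    n = len(x)
    mid = [[0.5 * (x[i][j] + y[i][j]) for j in range(n)] for i in range(n)]
    return cholesky_logdet(mid) - 0.5 * (cholesky_logdet(x) + cholesky_logdet(y))


def stein_kernel(x, y, theta=1.0):
    """Similarity between two SPD matrices; theta is an assumed kernel parameter."""
    return math.exp(-theta * stein_divergence(x, y))
```

Because each determinant is the product of a matrix's eigenvalues, this similarity depends on the eigenvalues of X, Y, and their average, which is where an eigenvalue adjustment of the kind described above would enter.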
Use the Detection Transformer as a Data Augmenter
Detection Transformer (DETR) is a Transformer-based object
detection model. In this paper, we demonstrate that it can also be used as a
data augmenter. We term our approach DETR-assisted CutMix, or DeMix for
short. DeMix builds on CutMix, a simple yet highly effective data augmentation
technique that has gained popularity in recent years. CutMix improves model
performance by cutting and pasting a patch from one image onto another,
yielding a new image. The corresponding label for this new example is specified
as the weighted average of the original labels, where the weight is
proportional to the area of the patches. CutMix selects a random patch to be
cut. In contrast, DeMix elaborately selects a semantically rich patch, located
by a pre-trained DETR. The label of the new image is specified in the same way
as in CutMix. Experimental results on benchmark datasets for image
classification demonstrate that DeMix significantly outperforms prior data
augmentation methods, including CutMix.
Comment: 13 pages
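The label-mixing rule described in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' implementation: images are nested lists, the patch is pasted at the same location it was cut from, and `box` is a hypothetical parameter that would be sampled at random in CutMix or taken from a pre-trained DETR detection in DeMix.

```python
def cutmix(image_a, image_b, label_a, label_b, box):
    """Paste a patch of image_b onto image_a and mix the labels by area.

    box is (top, left, height, width); the name and layout are assumptions
    made for this sketch.
    """
    top, left, height, width = box
    rows, cols = len(image_a), len(image_a[0])
    if top < 0 or left < 0 or top + height > rows or left + width > cols:
        raise ValueError("box must lie inside the image")

    # Copy image_a so the inputs are left unchanged, then overwrite the patch.
    mixed = [row[:] for row in image_a]
    for r in range(top, top + height):
        for c in range(left, left + width):
            mixed[r][c] = image_b[r][c]

    # The pasted image's label is weighted by the fraction of area it covers.
    patch_weight = (height * width) / (rows * cols)
    mixed_label = [
        (1.0 - patch_weight) * a + patch_weight * b
        for a, b in zip(label_a, label_b)
    ]
    return mixed, mixed_label
```

For example, pasting a 2-by-2 patch into a 4-by-4 image gives the pasted image's label a weight of 0.25 and the base image's label a weight of 0.75.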
Learning Discriminative Bayesian Networks from High-dimensional Continuous Neuroimaging Data
Due to their causal semantics, Bayesian networks (BNs) have been widely employed
to discover the underlying data relationships in exploratory studies, such as
brain research. Despite its success in modeling the probability distribution of
variables, a BN is naturally a generative model, which is not necessarily
discriminative. This may cause subtle but critical network changes that are of
investigative value across populations to be overlooked. In this paper, we
propose to improve the discriminative power of BN models for continuous
variables from two different perspectives. This yields two general
discriminative learning frameworks for Gaussian Bayesian networks (GBN). In the
first framework, we employ the Fisher kernel to bridge the generative models of GBN
and the discriminative classifiers of SVMs, and convert the GBN parameter
learning to Fisher kernel learning via minimizing a generalization error bound
of SVMs. In the second framework, we employ the max-margin criterion and build
it directly upon GBN models to explicitly optimize the classification
performance of the GBNs. The advantages and disadvantages of the two frameworks
are discussed and experimentally compared. Both of them demonstrate strong
power in learning discriminative parameters of GBNs for neuroimaging-based
brain network analysis, while maintaining reasonable representation
capacity. The contributions of this paper also include a new Directed Acyclic
Graph (DAG) constraint with a theoretical guarantee to ensure the graph validity
of GBN.
Comment: 16 pages and 5 figures for the article (excluding appendix)
OPML: A One-Pass Closed-Form Solution for Online Metric Learning
To achieve a low computational cost when performing online metric learning
for large-scale data, we present a one-pass closed-form solution, namely OPML, in
this paper. Specifically, the proposed OPML first adopts a one-pass triplet
construction strategy, which aims to use only a very small number of triplets
to approximate the representation ability of the whole set of original triplets
obtained by batch-manner methods. Then, OPML employs a closed-form solution to
update the metric for newly arriving samples, which leads to low space and time
complexity with respect to the feature dimensionality.
In addition, an extension of OPML (namely COPML) is further proposed to enhance
the robustness when, in real cases, the first several samples come from the same
class (i.e., the cold start problem). In the experiments, we systematically
evaluate our methods (OPML and COPML) on three typical tasks, including UCI
data classification, face verification, and abnormal event detection in videos,
with the aim of fully evaluating the proposed methods across different sample
sizes, feature dimensionalities, and feature extraction approaches (i.e.,
hand-crafted and deeply learned). The results show that OPML and COPML
obtain promising performance at a very low computational cost. Also, the
effectiveness of COPML under the cold start setting is experimentally verified.
Comment: 12 pages
A Novel Unsupervised Camera-aware Domain Adaptation Framework for Person Re-identification
Unsupervised cross-domain person re-identification (Re-ID) faces two key
issues. One is the data distribution discrepancy between source and target
domains, and the other is the lack of labelling information in the target domain.
They are addressed in this paper from the perspective of representation
learning. For the first issue, we highlight the presence of camera-level
sub-domains as a unique characteristic of person Re-ID, and develop
camera-aware domain adaptation to reduce the discrepancy not only between
source and target domains but also across these sub-domains. For the second
issue, we exploit the temporal continuity in each camera of the target domain to
create discriminative information. This is implemented by dynamically
generating online triplets within each batch, in order to take maximal
advantage of the steadily improving feature representation during training.
Together, the above two methods give rise to a novel unsupervised deep domain
adaptation framework for person Re-ID. Experiments and ablation studies on
benchmark datasets demonstrate its superiority and interesting properties.
Comment: Accepted by ICCV201
Subject-adaptive Integration of Multiple SICE Brain Networks with Different Sparsity
As a principled method for partial correlation estimation, sparse inverse covariance estimation (SICE) has been employed to model brain connectivity networks, which holds great promise for brain disease diagnosis. For each subject, the SICE method naturally leads to a set of connectivity networks with various sparsity levels. However, existing methods usually select a single network from them for classification, and the discriminative power of this set of networks has not been fully exploited. This paper argues that the connectivity networks at different sparsity levels present complementary connectivity patterns and therefore should be jointly considered to achieve high classification performance. In this paper, we propose a subject-adaptive method to integrate multiple SICE networks as a unified representation for classification. The integration weight is learned adaptively for each subject in order to endow the method with flexibility in dealing with subject variations. Furthermore, to respect the manifold geometry of SICE networks, the Stein kernel is employed to embed the manifold structure into a kernel-induced feature space, which allows a linear integration of SICE networks to be designed. The optimization of the integration weight and the classification of the integrated networks are performed via a sparse representation framework. Through our method, we provide a unified and effective network representation that is transparent to the sparsity level of SICE networks and can be readily utilized for further medical analysis. An experimental study on the ADHD and ADNI data sets demonstrates that the proposed integration method achieves a notable improvement in classification performance in comparison with methods using a single sparsity level of SICE networks and other commonly used integration methods, such as Multiple Kernel Learning.
METransformer: Radiology Report Generation by Transformer with Multiple Learnable Expert Tokens
In clinical scenarios, multi-specialist consultation could significantly
benefit the diagnosis, especially for intricate cases. This inspires us to
explore a "multi-expert joint diagnosis" mechanism to upgrade the existing
"single expert" framework commonly seen in the current literature. To this end,
we propose METransformer, a method to realize this idea with a
transformer-based backbone. The key design of our method is the introduction of
multiple learnable "expert" tokens into both the transformer encoder and
decoder. In the encoder, each expert token interacts with both vision tokens
and other expert tokens to learn to attend to different image regions for image
representation. These expert tokens are encouraged to capture complementary
information by an orthogonal loss that minimizes their overlap. In the decoder,
each attended expert token guides the cross-attention between input words and
visual tokens, thus influencing the generated report. A metrics-based expert
voting strategy is further developed to generate the final report. Through the
multi-expert concept, our model enjoys the merits of an ensemble-based
approach but in a manner that is computationally more efficient and
supports more sophisticated interactions among experts. Experimental results
demonstrate the promising performance of our proposed model on two widely used
benchmarks. Last but not least, the framework-level innovation makes our work
ready to incorporate advances in existing "single-expert" models to further
improve its performance.
Comment: Accepted by CVPR202
Conjugate Gradient Algorithm for the Symmetric Arrowhead Solution of Matrix Equation AXB=C
Based on the conjugate gradient (CG) algorithm, the constrained matrix equation AXB=C and the associated optimal approximation problem are considered for symmetric arrowhead matrix solutions under the premise of consistency. The convergence results of the method are presented. Finally, a numerical example is given to illustrate the efficiency of this method.
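To make the structural constraint on X concrete, here is a small sketch of a symmetric arrowhead matrix. It assumes the common convention in which nonzero entries may appear only in the first row, the first column, and the main diagonal. The paper may use a different orientation, and the helper names are illustrative. The CG iteration itself is not shown.

```python
def make_symmetric_arrowhead(diagonal, first_row_tail):
    """Build an n-by-n symmetric arrowhead matrix as nested lists.

    diagonal holds the n diagonal entries; first_row_tail holds the n - 1
    entries of the first row after the (0, 0) position, mirrored into the
    first column to keep the matrix symmetric.
    """
    n = len(diagonal)
    if len(first_row_tail) != n - 1:
        raise ValueError("first_row_tail must have len(diagonal) - 1 entries")
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = float(diagonal[i])
    for j in range(1, n):
        matrix[0][j] = float(first_row_tail[j - 1])
        matrix[j][0] = float(first_row_tail[j - 1])
    return matrix


def is_symmetric_arrowhead(matrix, tol=1e-12):
    """Check symmetry and that entries off the 'arrow' are (numerically) zero."""
    n = len(matrix)
    for i in range(n):
        for j in range(n):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False
            off_arrow = i != j and i != 0 and j != 0
            if off_arrow and abs(matrix[i][j]) > tol:
                return False
    return True
```

Under this convention an n-by-n symmetric arrowhead matrix has only 2n - 1 free parameters, which is the solution space a constrained solver for AXB=C would search.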