Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises cross-dataset recognition into
seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. This comprehensive problem-oriented review
of the advances in transfer learning has revealed not only the challenges in
transfer learning for visual recognition, but also the problems (eight of the
seventeen) that remain scarcely studied. The survey thus not only presents an
up-to-date technical review for researchers, but also gives machine learning
practitioners a systematic approach for categorising a real problem and looking
up a possible solution accordingly.
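The attribute-driven categorisation described above can be sketched as a small lookup. Note that the attribute names and problem families below are illustrative placeholders of our own choosing, not the survey's actual seventeen-problem taxonomy.

```python
# Illustrative sketch of an attribute-driven problem taxonomy. The attributes
# and family names are hypothetical stand-ins, not the survey's own labels.

def categorise(source_labeled: bool, target_labeled: bool,
               same_label_space: bool) -> str:
    """Map a combination of data/label attributes to a coarse problem family."""
    if target_labeled:
        return ("supervised domain adaptation" if same_label_space
                else "supervised transfer with label-space shift")
    if same_label_space:
        return ("unsupervised domain adaptation" if source_labeled
                else "unsupervised transfer")
    return "heterogeneous / open-set transfer"

print(categorise(source_labeled=True, target_labeled=False,
                 same_label_space=True))
```

A practitioner would extend the attribute set (feature-space homogeneity, label noise, sequential vs. batch data, and so on) to reach a finer-grained taxonomy.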
Deep Grassmann Manifold Optimization for Computer Vision
In this work, we propose methods that advance four areas in the field of computer vision: dimensionality reduction, deep feature embeddings, visual domain adaptation, and deep neural network compression. We combine concepts from the fields of manifold geometry and deep learning to develop cutting-edge methods in each of these areas. Each of the methods proposed in this work achieves state-of-the-art results in our experiments. We propose the Proxy Matrix Optimization (PMO) method for optimization over orthogonal matrix manifolds, such as the Grassmann manifold. This optimization technique is designed to be highly flexible, enabling it to be leveraged in many situations where traditional manifold optimization methods cannot be used.
We first use PMO in the field of dimensionality reduction, where we propose an iterative optimization approach to Principal Component Analysis (PCA) in a framework called Proxy Matrix Optimization based PCA (PM-PCA). We also demonstrate how PM-PCA can be used to solve the general Lp-PCA problem, a variant of PCA that uses arbitrary fractional norms and can be more robust to outliers. We then present Cascaded Projection (CaP), a method which uses tensor compression based on PMO to reduce the number of filters in deep neural networks. This, in turn, reduces the number of computational operations required to process each image with the network. Cascaded Projection is the first end-to-end trainable method for network compression that uses standard backpropagation to learn the optimal tensor compression. In the area of deep feature embeddings, we introduce Deep Euclidean Feature Representations through Adaptation on the Grassmann manifold (DEFRAG), which leverages PMO. The DEFRAG method improves the feature embeddings learned by deep neural networks through the use of auxiliary loss functions and Grassmann manifold optimization. Lastly, in the area of visual domain adaptation, we propose Manifold-Aligned Label Transfer for Domain Adaptation (MALT-DA) to transfer knowledge from samples in a known domain to an unknown domain based on cross-domain cluster correspondences.
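The proxy-matrix idea can be sketched under our own simplifying assumptions: optimise an unconstrained proxy matrix with plain gradient ascent and project it back to the nearest orthonormal-column matrix via SVD after each step. The function names, step size, and iteration count below are ours, not the thesis's; applied to PCA, the objective is to maximise trace(WᵀCW) over orthonormal W.

```python
import numpy as np

# Hedged sketch of a proxy-matrix-style optimisation for PCA: gradient ascent
# on an unconstrained matrix, followed by an SVD projection onto the set of
# matrices with orthonormal columns (the orthogonal Procrustes solution).

def project_orthonormal(M):
    """Nearest matrix with orthonormal columns, via thin SVD."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def pm_pca(X, k, lr=0.1, iters=500, seed=0):
    """Iteratively maximise trace(W^T C W) subject to W^T W = I."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    C = Xc.T @ Xc / len(Xc)                       # sample covariance
    W = project_orthonormal(rng.standard_normal((X.shape[1], k)))
    for _ in range(iters):
        W = project_orthonormal(W + lr * (2 * C @ W))  # ascent step + projection
    return W

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5)) @ np.diag([3.0, 2.0, 1.0, 0.5, 0.1])
W = pm_pca(X, 2)
# W should span the same subspace as the top-2 covariance eigenvectors.
```

On this objective the step is equivalent to orthogonal iteration, so it converges to the leading eigenspace; the appeal of the proxy-matrix formulation is that the same step-then-project recipe also applies to objectives where closed-form eigensolvers do not.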
A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts
Machine learning methods strive to acquire a robust model during training
that can generalize well to test samples, even under distribution shifts.
However, these methods often suffer from a performance drop due to unknown test
distributions. Test-time adaptation (TTA), an emerging paradigm, has the
potential to adapt a pre-trained model to unlabeled data during testing, before
making predictions. Recent progress in this paradigm highlights the significant
benefits of utilizing unlabeled data for training self-adapted models prior to
inference. In this survey, we divide TTA into several distinct categories,
namely, test-time (source-free) domain adaptation, test-time batch adaptation,
online test-time adaptation, and test-time prior adaptation. For each category,
we provide a comprehensive taxonomy of advanced algorithms, followed by a
discussion of different learning scenarios. Furthermore, we analyze relevant
applications of TTA and discuss open challenges and promising areas for future
research. A comprehensive list of TTA methods can be found at
\url{https://github.com/tim-learn/awesome-test-time-adaptation}.
Comment: Discussions, comments, and questions are all welcome at
\url{https://github.com/tim-learn/awesome-test-time-adaptation}.
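As a concrete toy instance of the paradigm (in the spirit of entropy-minimisation TTA methods such as Tent, but with a model and update rule that are our own construction, not one of the survey's algorithms), a frozen linear classifier's bias can be adapted on an unlabeled test batch by descending the mean prediction entropy:

```python
import numpy as np

# Toy test-time adaptation sketch: minimise prediction entropy on unlabeled
# test data. The model (a fixed linear classifier) and the choice to adapt
# only the bias are illustrative assumptions.

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=1).mean()

def tta_adapt_bias(X, W, b, lr=0.2, steps=100):
    """Gradient descent on mean prediction entropy, updating only the bias."""
    for _ in range(steps):
        p = softmax(X @ W + b)
        h = -(p * np.log(p + 1e-12)).sum(axis=1, keepdims=True)  # per-sample H
        # dH/dlogits for softmax: -p * (log p + H), averaged over the batch
        grad_logits = -p * (np.log(p + 1e-12) + h) / len(X)
        b = b - lr * grad_logits.sum(axis=0)
    return b

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))          # frozen pre-trained weights
b0 = np.zeros(3)
X_test = rng.standard_normal((64, 4)) + 0.8   # shifted test distribution
H_before = mean_entropy(softmax(X_test @ W + b0))
b1 = tta_adapt_bias(X_test, W, b0)
H_after = mean_entropy(softmax(X_test @ W + b1))
```

Real TTA methods typically adapt normalisation statistics or affine parameters inside a deep network rather than a single bias, but the objective (confident, low-entropy predictions on the unlabeled test batch) is the same.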
Efficient Deformable Shape Correspondence via Kernel Matching
We present a method to match three dimensional shapes under non-isometric
deformations, topology changes and partiality. We formulate the problem as
matching between a set of pair-wise and point-wise descriptors, imposing a
continuity prior on the mapping, and propose a projected descent optimization
procedure inspired by difference of convex functions (DC) programming.
Surprisingly, in spite of the highly non-convex nature of the resulting
quadratic assignment problem, our method converges to a semantically meaningful
and continuous mapping in most of our experiments, and scales well. We provide
preliminary theoretical analysis and several interpretations of the method.
Comment: Accepted for oral presentation at 3DV 2017, including supplementary material.
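One common way to attack such relaxed quadratic assignment problems, sketched here as our own toy construction rather than the paper's kernel-matching algorithm, is projected descent over doubly-stochastic matrices, with Sinkhorn normalisation playing the role of the projection:

```python
import numpy as np

# Toy relax-and-project matching: relax the permutation in a quadratic
# assignment objective ||K1 - P K2 P^T||_F^2 to a doubly-stochastic matrix,
# take gradient steps, and renormalise with Sinkhorn iterations. Step size
# and iteration counts are illustrative assumptions.

def sinkhorn(P, iters=30):
    """Approximate projection onto doubly-stochastic matrices."""
    P = np.clip(P, 1e-9, None)
    for _ in range(iters):
        P = P / P.sum(axis=1, keepdims=True)   # normalise rows
        P = P / P.sum(axis=0, keepdims=True)   # normalise columns
    return P

def match(K1, K2, lr=0.01, steps=500, seed=0):
    n = len(K1)
    rng = np.random.default_rng(seed)
    P = sinkhorn(np.ones((n, n)) + 0.1 * rng.random((n, n)))
    f = lambda Q: np.linalg.norm(K1 - Q @ K2 @ Q.T) ** 2
    best, f_best = P.copy(), f(P)
    for _ in range(steps):
        grad = -4 * (K1 - P @ K2 @ P.T) @ P @ K2  # gradient (symmetric K1, K2)
        P = sinkhorn(P - lr * grad)
        if f(P) < f_best:
            best, f_best = P.copy(), f(P)
    return best, f_best

# Toy instance: K2 is a permuted copy of K1, so a perfect match exists.
rng = np.random.default_rng(1)
pts = rng.random((6, 3))
K1 = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
perm = rng.permutation(6)
K2 = K1[np.ix_(perm, perm)]
P, residual = match(K1, K2)
```

The paper's contribution lies in a more principled descent scheme (inspired by difference-of-convex programming) and in descriptors robust to non-isometry, topology change, and partiality; this sketch only shows the shape of the relaxation.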
ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse
The rapid expansion of foundation pre-trained models and their fine-tuned
counterparts has significantly contributed to the advancement of machine
learning. Leveraging pre-trained models to extract knowledge and expedite
learning in real-world tasks, known as "Model Reuse", has become crucial in
various applications. Previous research focuses on reusing models within a
certain aspect, including reusing model weights, structures, and hypothesis
spaces. This paper introduces ZhiJian, a comprehensive and user-friendly
toolbox for model reuse, utilizing the PyTorch backend. ZhiJian presents a
novel paradigm that unifies diverse perspectives on model reuse, encompassing
target architecture construction with PTM, tuning target model with PTM, and
PTM-based inference. This empowers deep learning practitioners to explore
downstream tasks and identify the complementary advantages among different
methods. ZhiJian is readily accessible at
https://github.com/zhangyikaii/lamda-zhijian, facilitating seamless utilization
of pre-trained models and streamlining the model reuse process for researchers
and developers.
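To make the "reuse a frozen pre-trained extractor under a new task head" pattern concrete, here is a generic numpy sketch of linear probing. This is not ZhiJian's API; the extractor, data, and all names are hypothetical stand-ins for the general paradigm.

```python
import numpy as np

# Generic model-reuse sketch (linear probing): keep a "pre-trained" feature
# extractor frozen and fit only a new head on the downstream task. The
# extractor here is a stand-in (fixed random projection + ReLU), not a real
# pre-trained model.

rng = np.random.default_rng(0)

W_pre = rng.standard_normal((10, 32))      # frozen "pre-trained" weights
def extract(X):
    return np.maximum(X @ W_pre, 0.0)      # frozen ReLU features

# Hypothetical downstream task: binary labels from a simple linear rule.
X = rng.standard_normal((200, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

# Reuse step: fit only a ridge-regression head on the frozen features.
F = extract(X)
lam = 1e-2
head = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ y)

pred = (F @ head > 0.5).astype(float)
acc = (pred == y).mean()
```

Toolboxes like ZhiJian generalise this single pattern: instead of only fitting a new head, they also support reusing model structures, tuning the target model with the pre-trained weights, and pre-trained-model-based inference, all behind one interface.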