Self-Supervised Learning for Domain Adaptation on Point-Clouds
Self-supervised learning (SSL) makes it possible to learn useful representations from
unlabeled data and has been applied effectively for domain adaptation (DA) on
images. It is still unknown if and how it can be leveraged for domain
adaptation in 3D perception. Here we describe the first study of SSL for DA on
point clouds. We introduce a new family of pretext tasks, Deformation
Reconstruction, motivated by the deformations encountered in sim-to-real
transformations. The key idea is to deform regions of the input shape and use a
neural network to reconstruct them. We design three types of shape deformation
neural network to reconstruct them. We design three types of shape deformation
methods: (1) \textit{Volume-based:} shape deformation based on proximity in the
input space; (2) \textit{Feature-based:} deforming regions in the shape that
are semantically similar; and (3) \textit{Sampling-based:} shape deformation
based on three simple sampling schemes. As a separate contribution, we also
develop a new method based on the Mixup training procedure for point-clouds.
Evaluations on six domain adaptation setups across synthetic and real furniture data
demonstrate a large improvement over previous work.
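Below is a minimal sketch of what a volume-based deformation pretext task could look like; the region size, the noise scale, and the Chamfer-style reconstruction loss are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def volume_deform(points, k=64, noise_std=0.05, rng=None):
    """Volume-based deformation sketch: pick an anchor point and collapse its
    k nearest neighbors onto the anchor, plus Gaussian jitter."""
    rng = rng or np.random.default_rng()
    anchor = points[rng.integers(len(points))]
    dists = np.linalg.norm(points - anchor, axis=1)
    region = np.argsort(dists)[:k]                    # proximity in the input space
    deformed = points.copy()
    # collapse the selected region onto the anchor and jitter it
    deformed[region] = anchor + noise_std * rng.standard_normal((k, 3))
    return deformed, region

def chamfer(a, b):
    """Symmetric Chamfer distance between two point sets (N,3) and (M,3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Pretext objective: a network sees the deformed cloud and must reconstruct
# the original points of the deformed region.
original = np.random.rand(1024, 3).astype(np.float32)
deformed, region = volume_deform(original)
# predicted = network(deformed)[region]   # hypothetical reconstruction head
predicted = deformed[region]               # placeholder for illustration only
loss = chamfer(predicted, original[region])
```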
De-Confusing Pseudo-Labels in Source-Free Domain Adaptation
Source-free domain adaptation (SFDA) aims to adapt a source-trained model to
an unlabeled target domain without access to the source data. SFDA has
attracted growing attention in recent years, with existing approaches focusing on
self-training, which usually involves pseudo-labeling techniques. In this paper,
we introduce a novel noise-learning approach tailored to the noise
distribution in domain adaptation settings, which learns to de-confuse the
pseudo-labels. More specifically, we learn a noise transition matrix of the
pseudo-labels to capture the label corruption of each class and learn the
underlying true label distribution. Estimating the noise transition matrix
enables a better true class-posterior estimation, resulting in better
prediction accuracy. We demonstrate the effectiveness of our approach when
combined with several SFDA methods: SHOT, SHOT++, and AaD. We obtain
state-of-the-art results on three domain adaptation datasets: VisDA, DomainNet,
and OfficeHome.
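As an illustration of the noise transition matrix idea, the following sketch maps a model's clean class posterior through a learnable row-stochastic matrix to obtain the pseudo-label posterior used for training; the parameterization and initialization are assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

num_classes = 12  # e.g. VisDA-C

# Learnable (unconstrained) transition parameters; rows are normalized with a
# softmax so T[i, j] ~ P(pseudo-label = j | true label = i).
T_logits = torch.nn.Parameter(torch.eye(num_classes) * 4.0)

def pseudo_label_posterior(clean_logits):
    """Map the model's clean class posterior through the noise transition
    matrix to obtain the posterior over (noisy) pseudo-labels."""
    T = F.softmax(T_logits, dim=1)                # (C, C) row-stochastic matrix
    clean_post = F.softmax(clean_logits, dim=1)   # (B, C) P(y_true | x)
    return clean_post @ T                         # (B, C) P(y_pseudo | x)

# Training step (schematic): fit the noisy posterior to the pseudo-labels,
# then use the clean posterior for the final predictions.
logits = torch.randn(8, num_classes, requires_grad=True)   # stand-in for model output
pseudo_labels = torch.randint(0, num_classes, (8,))
noisy_post = pseudo_label_posterior(logits)
loss = F.nll_loss(torch.log(noisy_post + 1e-8), pseudo_labels)
loss.backward()   # gradients flow to both the model and T_logits
```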
Equivariant Architectures for Learning in Deep Weight Spaces
Designing machine learning architectures for processing neural networks in
their raw weight matrix form is a newly introduced research direction.
Unfortunately, the unique symmetry structure of deep weight spaces makes this
design very challenging. If successful, such architectures would be capable of
performing a wide range of intriguing tasks, from adapting a pre-trained
network to a new domain to editing objects represented as functions (INRs or
NeRFs). As a first step towards this goal, we present here a novel network
architecture for learning in deep weight spaces. It takes as input a
concatenation of weights and biases of a pre-trained MLP and processes it using
a composition of layers that are equivariant to the natural permutation
symmetry of the MLP's weights: Changing the order of neurons in intermediate
layers of the MLP does not affect the function it represents. We provide a full
characterization of all affine equivariant and invariant layers for these
symmetries and show how these layers can be implemented using three basic
operations: pooling, broadcasting, and fully connected layers applied to the
input in an appropriate manner. We demonstrate the effectiveness of our
architecture and its advantages over natural baselines in a variety of learning
tasks.
Comment: ICML 2023
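To make the pooling/broadcasting/fully connected construction concrete, here is a simplified, single-axis sketch of a layer that is equivariant to permutations of a set of neurons; the paper's actual layers couple the weights and biases across all layers of the input MLP, which this toy example does not attempt.

```python
import torch
import torch.nn as nn

class NeuronEquivariantLinear(nn.Module):
    """Affine map on a feature tensor x of shape (batch, n_neurons, d) that is
    equivariant to permutations of the neuron axis: a pointwise linear map
    plus a broadcast of the pooled (mean) features."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.pointwise = nn.Linear(d_in, d_out)           # applied to each neuron
        self.pooled = nn.Linear(d_in, d_out, bias=False)  # applied to the pooled mean

    def forward(self, x):                              # x: (B, N, d_in)
        mean = x.mean(dim=1, keepdim=True)             # pooling over neurons
        return self.pointwise(x) + self.pooled(mean)   # broadcast back to all neurons

# Permuting the neurons permutes the output rows in the same way.
layer = NeuronEquivariantLinear(4, 8)
x = torch.randn(2, 16, 4)
perm = torch.randperm(16)
assert torch.allclose(layer(x)[:, perm], layer(x[:, perm]), atol=1e-6)
```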
Data Augmentations in Deep Weight Spaces
Learning in weight spaces, where neural networks process the weights of other
deep neural networks, has emerged as a promising research direction with
applications in various fields, from analyzing and editing neural fields and
implicit neural representations, to network pruning and quantization. Recent
works designed architectures for effective learning in that space, which take
into account its unique permutation-equivariant structure. Unfortunately, so
far these architectures suffer from severe overfitting and were shown to
benefit from large datasets. This poses a significant challenge because
generating data for this learning setup is laborious and time-consuming since
each data sample is a full set of network weights that has to be trained. In
this paper, we address this difficulty by investigating data augmentations for
weight spaces, a set of techniques that enable generating new data examples on
the fly without having to train additional input weight space elements. We
first review several recently proposed data augmentation schemes and divide
them into categories. We then introduce a novel
augmentation scheme based on the Mixup method. We evaluate the performance of
these techniques on existing benchmarks as well as new benchmarks we generate,
which can be valuable for future studies.
Comment: Accepted to NeurIPS 2023 Workshop on Symmetry and Geometry in Neural Representations
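A minimal sketch of a mixup-style augmentation on flattened weight vectors is shown below; it follows the standard Mixup recipe with a Beta-distributed coefficient and deliberately ignores the permutation alignment between networks that a more careful weight-space variant would handle, so it should be read as an assumption-laden illustration rather than the paper's scheme.

```python
import numpy as np

def weight_space_mixup(w_a, w_b, y_a, y_b, alpha=0.2, rng=None):
    """Mixup for weight-space inputs: convexly combine two flattened weight
    vectors (and their labels) with a Beta-distributed coefficient.
    Note: this naive version does not align the hidden-neuron permutations
    of the two input networks before mixing."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    w_mix = lam * w_a + (1.0 - lam) * w_b
    y_mix = lam * y_a + (1.0 - lam) * y_b   # soft label
    return w_mix, y_mix

# Usage: w_a and w_b are flattened parameter vectors of two trained networks.
w_a, w_b = np.random.randn(2, 10_000)
y_a = np.array([1.0, 0.0]); y_b = np.array([0.0, 1.0])   # one-hot class labels
w_mix, y_mix = weight_space_mixup(w_a, w_b, y_a, y_b)
```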
Guided Deep Kernel Learning
Combining Gaussian processes with the expressive power of deep neural
networks is commonly done nowadays through deep kernel learning (DKL).
Unfortunately, due to the kernel optimization process, the resulting models often
lose their Bayesian benefits. In this study, we present a novel approach for
learning deep kernels by utilizing infinite-width neural networks. We propose
to use the Neural Network Gaussian Process (NNGP) model as a guide to the DKL
model in the optimization process. Our approach harnesses the reliable
uncertainty estimation of the NNGPs to adapt the DKL target confidence when it
encounters novel data points. As a result, we get the best of both worlds: we
leverage the Bayesian behavior of the NNGP, namely its robustness to
overfitting and its accurate uncertainty estimation, while maintaining the
generalization ability, scalability, and flexibility of deep kernels.
Empirically, we show, on multiple benchmark datasets of varying sizes and
dimensionality, that our method is robust to overfitting, has good predictive
performance, and provides reliable uncertainty estimates.
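The following toy sketch illustrates one way NNGP uncertainty could guide a DKL objective: per-point DKL losses are down-weighted where the NNGP predictive variance is high. The weighting rule, names, and temperature are assumptions for illustration only, not the paper's objective.

```python
import torch

def guided_dkl_loss(dkl_nll, nngp_var, temperature=1.0):
    """Toy guidance rule: per-point negative log-likelihoods from a DKL model
    are re-weighted by a confidence derived from the NNGP predictive variance,
    so points the NNGP is uncertain about contribute less to the DKL fit."""
    confidence = torch.exp(-temperature * nngp_var)   # in (0, 1], smaller for novel points
    return (confidence * dkl_nll).mean()

# dkl_nll would come from a deep-kernel GP's predictive likelihood, and
# nngp_var from an infinite-width NNGP posterior; random stand-ins here.
dkl_nll = torch.rand(32)
nngp_var = torch.rand(32)
loss = guided_dkl_loss(dkl_nll, nngp_var)
```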
Multi-Task Learning as a Bargaining Game
In multi-task learning (MTL), a joint model is trained to simultaneously make
predictions for several tasks. Joint training reduces computation costs and
improves data efficiency; however, since the gradients of these different tasks
may conflict, training a joint model for MTL often yields lower performance
than its corresponding single-task counterparts. A common method for
alleviating this issue is to combine per-task gradients into a joint update
direction using a particular heuristic. In this paper, we propose viewing the
gradients combination step as a bargaining game, where tasks negotiate to reach
an agreement on a joint direction of parameter update. Under certain
assumptions, the bargaining problem has a unique solution, known as the Nash
Bargaining Solution, which we propose to use as a principled approach to
multi-task learning. We describe a new MTL optimization procedure, Nash-MTL,
and derive theoretical guarantees for its convergence. Empirically, we show
that Nash-MTL achieves state-of-the-art results on multiple MTL benchmarks in
various domains.
Comment: ICML 2022
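For intuition, here is a hedged sketch of the bargaining-based gradient combination: the Nash bargaining characterization reduces to finding positive coefficients alpha with (G G^T) alpha = 1/alpha element-wise and using G^T alpha as the joint update direction. The paper uses its own iterative approximation procedure; the generic root-finding below is only a simplified stand-in.

```python
import numpy as np
from scipy.optimize import least_squares

def nash_mtl_direction(grads):
    """Combine per-task gradients via the Nash bargaining characterization:
    find alpha > 0 with (G G^T) alpha = 1 / alpha (element-wise), then use
    delta = G^T alpha as the joint update direction."""
    G = np.asarray(grads)                 # (num_tasks, num_params), rows are gradients
    K = G @ G.T                           # (T, T) Gram matrix of task gradients
    def residual(log_alpha):              # optimize in log-space to keep alpha > 0
        alpha = np.exp(log_alpha)
        return (K @ alpha) * alpha - 1.0
    sol = least_squares(residual, x0=np.zeros(len(G)))
    alpha = np.exp(sol.x)
    return G.T @ alpha, alpha

# Usage: two toy task gradients that partially conflict.
g1 = np.array([1.0, 0.2]); g2 = np.array([-0.5, 1.0])
delta, alpha = nash_mtl_direction([g1, g2])
```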