Self-Supervised Learning for Domain Adaptation on Point-Clouds
Self-supervised learning (SSL) makes it possible to learn useful representations from
unlabeled data and has been applied effectively for domain adaptation (DA) on
images. It is still unknown if and how it can be leveraged for domain
adaptation in 3D perception. Here we describe the first study of SSL for DA on
point clouds. We introduce a new family of pretext tasks, Deformation
Reconstruction, motivated by the deformations encountered in sim-to-real
transformations. The key idea is to deform regions of the input shape and use a
neural network to reconstruct them. We design three types of shape deformation
neural network to reconstruct them. We design three types of shape deformation
methods: (1) \textit{Volume-based:} shape deformation based on proximity in the
input space; (2) \textit{Feature-based:} deforming regions in the shape that
are semantically similar; and (3) \textit{Sampling-based:} shape deformation
based on three simple sampling schemes. As a separate contribution, we also
develop a new method based on the Mixup training procedure for point-clouds.
Evaluations on six domain adaptation setups across synthetic and real furniture data
demonstrate a large improvement over previous work.
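Below is a minimal sketch of what a volume-based deformation pretext task could look like; the region size, the noise scale, and the Chamfer-style reconstruction loss are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def volume_deform(points, k=64, noise_std=0.05, rng=None):
    """Volume-based deformation sketch: pick an anchor point and collapse its
    k nearest neighbors onto the anchor, plus Gaussian jitter."""
    rng = rng or np.random.default_rng()
    anchor = points[rng.integers(len(points))]
    dists = np.linalg.norm(points - anchor, axis=1)
    region = np.argsort(dists)[:k]                    # proximity in the input space
    deformed = points.copy()
    # collapse the selected region onto the anchor and jitter it
    deformed[region] = anchor + noise_std * rng.standard_normal((k, 3))
    return deformed, region

def chamfer(a, b):
    """Symmetric Chamfer distance between two point sets (N,3) and (M,3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Pretext objective: a network sees the deformed cloud and must reconstruct
# the original points of the deformed region.
original = np.random.rand(1024, 3).astype(np.float32)
deformed, region = volume_deform(original)
# predicted = network(deformed)[region]   # hypothetical reconstruction head
predicted = deformed[region]               # placeholder for illustration only
loss = chamfer(predicted, original[region])
```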
De-Confusing Pseudo-Labels in Source-Free Domain Adaptation
Source-free domain adaptation (SFDA) aims to adapt a source-trained model to
an unlabeled target domain without access to the source data. SFDA has
attracted growing attention in recent years, with existing approaches focusing on
self-training, which usually involves pseudo-labeling techniques. In this paper,
we introduce a novel noise-learning approach tailored to the noise
distribution in domain adaptation settings, which learns to de-confuse the
pseudo-labels. More specifically, we learn a noise transition matrix of the
pseudo-labels to capture the label corruption of each class and learn the
underlying true label distribution. Estimating the noise transition matrix
enables a better true class-posterior estimation, resulting in better
prediction accuracy. We demonstrate the effectiveness of our approach when
combined with several SFDA methods: SHOT, SHOT++, and AaD. We obtain
state-of-the-art results on three domain adaptation datasets: VisDA, DomainNet,
and OfficeHome.
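As an illustration of the noise transition matrix idea, the following sketch maps a model's clean class posterior through a learnable row-stochastic matrix to obtain the pseudo-label posterior used for training; the parameterization and initialization are assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

num_classes = 12  # e.g. VisDA-C

# Learnable (unconstrained) transition parameters; rows are normalized with a
# softmax so T[i, j] ~ P(pseudo-label = j | true label = i).
T_logits = torch.nn.Parameter(torch.eye(num_classes) * 4.0)

def pseudo_label_posterior(clean_logits):
    """Map the model's clean class posterior through the noise transition
    matrix to obtain the posterior over (noisy) pseudo-labels."""
    T = F.softmax(T_logits, dim=1)                # (C, C) row-stochastic matrix
    clean_post = F.softmax(clean_logits, dim=1)   # (B, C) P(y_true | x)
    return clean_post @ T                         # (B, C) P(y_pseudo | x)

# Training step (schematic): fit the noisy posterior to the pseudo-labels,
# then use the clean posterior for the final predictions.
logits = torch.randn(8, num_classes, requires_grad=True)   # stand-in for model output
pseudo_labels = torch.randint(0, num_classes, (8,))
noisy_post = pseudo_label_posterior(logits)
loss = F.nll_loss(torch.log(noisy_post + 1e-8), pseudo_labels)
loss.backward()   # gradients flow to both the model and T_logits
```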
Equivariant Architectures for Learning in Deep Weight Spaces
Designing machine learning architectures for processing neural networks in
their raw weight matrix form is a newly introduced research direction.
Unfortunately, the unique symmetry structure of deep weight spaces makes this
design very challenging. If successful, such architectures would be capable of
performing a wide range of intriguing tasks, from adapting a pre-trained
network to a new domain to editing objects represented as functions (INRs or
NeRFs). As a first step towards this goal, we present here a novel network
architecture for learning in deep weight spaces. It takes as input a
concatenation of weights and biases of a pre-trained MLP and processes it using
a composition of layers that are equivariant to the natural permutation
symmetry of the MLP's weights: Changing the order of neurons in intermediate
layers of the MLP does not affect the function it represents. We provide a full
characterization of all affine equivariant and invariant layers for these
symmetries and show how these layers can be implemented using three basic
operations: pooling, broadcasting, and fully connected layers applied to the
input in an appropriate manner. We demonstrate the effectiveness of our
architecture and its advantages over natural baselines in a variety of learning
tasks.
Comment: ICML 2023
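To make the pooling/broadcasting/fully connected construction concrete, here is a simplified, single-axis sketch of a layer that is equivariant to permutations of a set of neurons; the paper's actual layers couple the weights and biases across all layers of the input MLP, which this toy example does not attempt.

```python
import torch
import torch.nn as nn

class NeuronEquivariantLinear(nn.Module):
    """Affine map on a feature tensor x of shape (batch, n_neurons, d) that is
    equivariant to permutations of the neuron axis: a pointwise linear map
    plus a broadcast of the pooled (mean) features."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.pointwise = nn.Linear(d_in, d_out)           # applied to each neuron
        self.pooled = nn.Linear(d_in, d_out, bias=False)  # applied to the pooled mean

    def forward(self, x):                              # x: (B, N, d_in)
        mean = x.mean(dim=1, keepdim=True)             # pooling over neurons
        return self.pointwise(x) + self.pooled(mean)   # broadcast back to all neurons

# Permuting the neurons permutes the output rows in the same way.
layer = NeuronEquivariantLinear(4, 8)
x = torch.randn(2, 16, 4)
perm = torch.randperm(16)
assert torch.allclose(layer(x)[:, perm], layer(x[:, perm]), atol=1e-6)
```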
Data Augmentations in Deep Weight Spaces
Learning in weight spaces, where neural networks process the weights of other
deep neural networks, has emerged as a promising research direction with
applications in various fields, from analyzing and editing neural fields and
implicit neural representations, to network pruning and quantization. Recent
works designed architectures for effective learning in that space, which take
into account its unique permutation-equivariant structure. Unfortunately, so
far these architectures suffer from severe overfitting and were shown to
benefit from large datasets. This poses a significant challenge because
generating data for this learning setup is laborious and time-consuming since
each data sample is a full set of network weights that has to be trained. In
this paper, we address this difficulty by investigating data augmentations for
weight spaces, a set of techniques that enable generating new data examples on
the fly without having to train additional input weight space elements. We
first review several recently proposed data augmentation schemes and divide
them into categories. We then introduce a novel
augmentation scheme based on the Mixup method. We evaluate the performance of
these techniques on existing benchmarks as well as new benchmarks we generate,
which can be valuable for future studies.
Comment: Accepted to NeurIPS 2023 Workshop on Symmetry and Geometry in Neural Representations
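A minimal sketch of a mixup-style augmentation on flattened weight vectors is shown below; it follows the standard Mixup recipe with a Beta-distributed coefficient and deliberately ignores the permutation alignment between networks that a more careful weight-space variant would handle, so it should be read as an assumption-laden illustration rather than the paper's scheme.

```python
import numpy as np

def weight_space_mixup(w_a, w_b, y_a, y_b, alpha=0.2, rng=None):
    """Mixup for weight-space inputs: convexly combine two flattened weight
    vectors (and their labels) with a Beta-distributed coefficient.
    Note: this naive version does not align the hidden-neuron permutations
    of the two input networks before mixing."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    w_mix = lam * w_a + (1.0 - lam) * w_b
    y_mix = lam * y_a + (1.0 - lam) * y_b   # soft label
    return w_mix, y_mix

# Usage: w_a and w_b are flattened parameter vectors of two trained networks.
w_a, w_b = np.random.randn(2, 10_000)
y_a = np.array([1.0, 0.0]); y_b = np.array([0.0, 1.0])   # one-hot class labels
w_mix, y_mix = weight_space_mixup(w_a, w_b, y_a, y_b)
```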
Guided Deep Kernel Learning
Combining Gaussian processes with the expressive power of deep neural
networks is commonly done nowadays through deep kernel learning (DKL).
Unfortunately, due to the kernel optimization process, the resulting models often
lose their Bayesian benefits. In this study, we present a novel approach for
learning deep kernels by utilizing infinite-width neural networks. We propose
to use the Neural Network Gaussian Process (NNGP) model as a guide to the DKL
model in the optimization process. Our approach harnesses the reliable
uncertainty estimation of the NNGPs to adapt the DKL target confidence when it
encounters novel data points. As a result, we get the best of both worlds: we
leverage the Bayesian behavior of the NNGP, namely its robustness to
overfitting and its accurate uncertainty estimation, while maintaining the
generalization ability, scalability, and flexibility of deep kernels.
Empirically, we show, on multiple benchmark datasets of varying sizes and
dimensionality, that our method is robust to overfitting, has good predictive
performance, and provides reliable uncertainty estimates.
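The following toy sketch illustrates one way NNGP uncertainty could guide a DKL objective: per-point DKL losses are down-weighted where the NNGP predictive variance is high. The weighting rule, names, and temperature are assumptions for illustration only, not the paper's objective.

```python
import torch

def guided_dkl_loss(dkl_nll, nngp_var, temperature=1.0):
    """Toy guidance rule: per-point negative log-likelihoods from a DKL model
    are re-weighted by a confidence derived from the NNGP predictive variance,
    so points the NNGP is uncertain about contribute less to the DKL fit."""
    confidence = torch.exp(-temperature * nngp_var)   # in (0, 1], smaller for novel points
    return (confidence * dkl_nll).mean()

# dkl_nll would come from a deep-kernel GP's predictive likelihood, and
# nngp_var from an infinite-width NNGP posterior; random stand-ins here.
dkl_nll = torch.rand(32)
nngp_var = torch.rand(32)
loss = guided_dkl_loss(dkl_nll, nngp_var)
```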
Multi-Task Learning as a Bargaining Game
In multi-task learning (MTL), a joint model is trained to simultaneously make
predictions for several tasks. Joint training reduces computation costs and
improves data efficiency; however, since the gradients of these different tasks
may conflict, training a joint model for MTL often yields lower performance
than its corresponding single-task counterparts. A common method for
alleviating this issue is to combine per-task gradients into a joint update
direction using a particular heuristic. In this paper, we propose viewing the
gradients combination step as a bargaining game, where tasks negotiate to reach
an agreement on a joint direction of parameter update. Under certain
assumptions, the bargaining problem has a unique solution, known as the Nash
Bargaining Solution, which we propose to use as a principled approach to
multi-task learning. We describe a new MTL optimization procedure, Nash-MTL,
and derive theoretical guarantees for its convergence. Empirically, we show
that Nash-MTL achieves state-of-the-art results on multiple MTL benchmarks in
various domains.
Comment: ICML 2022
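For intuition, here is a hedged sketch of the bargaining-based gradient combination: the Nash bargaining characterization reduces to finding positive coefficients alpha with (G G^T) alpha = 1/alpha element-wise and using G^T alpha as the joint update direction. The paper uses its own iterative approximation procedure; the generic root-finding below is only a simplified stand-in.

```python
import numpy as np
from scipy.optimize import least_squares

def nash_mtl_direction(grads):
    """Combine per-task gradients via the Nash bargaining characterization:
    find alpha > 0 with (G G^T) alpha = 1 / alpha (element-wise), then use
    delta = G^T alpha as the joint update direction."""
    G = np.asarray(grads)                 # (num_tasks, num_params), rows are gradients
    K = G @ G.T                           # (T, T) Gram matrix of task gradients
    def residual(log_alpha):              # optimize in log-space to keep alpha > 0
        alpha = np.exp(log_alpha)
        return (K @ alpha) * alpha - 1.0
    sol = least_squares(residual, x0=np.zeros(len(G)))
    alpha = np.exp(sol.x)
    return G.T @ alpha, alpha

# Usage: two toy task gradients that partially conflict.
g1 = np.array([1.0, 0.2]); g2 = np.array([-0.5, 1.0])
delta, alpha = nash_mtl_direction([g1, g2])
```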