TTT-UCDR: Test-time Training for Universal Cross-Domain Retrieval
Image retrieval under generalized test scenarios has gained significant
momentum in the literature, and the recently proposed protocol of Universal
Cross-Domain Retrieval is a pioneer in this direction. A common practice in any
such generalized classification or retrieval algorithm is to exploit samples
from multiple domains during training to learn a domain-invariant
representation of the data. Such a criterion is often restrictive, and thus in this
work, for the first time, we explore the challenges associated with generalized
retrieval problems under a low-data regime, which is quite relevant in many
real-world scenarios. We attempt to make any retrieval model trained on a small
cross-domain dataset (containing just two training domains) more generalizable
towards any unknown query domain or category by quickly adapting it to the test
data during inference. This form of test-time training or adaptation of the
retrieval model is explored by means of a number of self-supervision-based loss
functions, such as RotNet, jigsaw-puzzle solving, and Barlow Twins, in this
work. Extensive experiments on multiple large-scale datasets demonstrate the
effectiveness of the proposed approach.
Comment: 9 pages, 1 figure, 3 tables
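One of the self-supervised objectives named above, Barlow Twins, is compact enough to sketch directly. The following is a hedged NumPy illustration of the loss alone (batch shapes and the `lam` weight are illustrative assumptions, not the paper's settings): the cross-correlation matrix of two augmented views' embeddings is pushed toward the identity, so diagonal terms enforce invariance and off-diagonal terms reduce redundancy.

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Barlow Twins self-supervision loss on two batches of embeddings.

    z_a, z_b: (N, D) embeddings of two augmented views of the same batch.
    The loss drives the (D, D) cross-correlation matrix toward the identity:
    diagonal terms -> 1 (invariance), off-diagonal -> 0 (redundancy reduction).
    """
    # Standardise each embedding dimension over the batch.
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-8)
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-8)
    n = z_a.shape[0]
    c = z_a.T @ z_b / n                         # cross-correlation matrix
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()   # invariance term
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()  # redundancy term
    return on_diag + lam * off_diag
```

At test time, such a loss needs no labels, which is what makes it usable for adapting a retrieval model to unseen query domains during inference.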
Generalized Zero-Shot Cross-Modal Retrieval
Cross-modal retrieval is an important research area due to its wide range of applications, and several algorithms have been proposed to address this task. We feel that it is the right time to take a step back and analyze the current status of research in this area. As new object classes are continuously being discovered over time, it is necessary to design algorithms that can generalize to data from previously unseen classes. Towards that goal, our first contribution is to establish protocols for generalized zero-shot cross-modal retrieval and analyze the generalization ability of the standard cross-modal algorithms. Second, we propose a semantic-aware ranking algorithm that can be used as an add-on to any existing cross-modal approach to improve its performance on both seen and unseen classes. Finally, we propose a modification of the standard evaluation metric (MAP for single-label data and NDCG for multi-label data), which we feel is a more intuitive measure of cross-modal retrieval performance. Extensive experiments on two single-label and three multi-label cross-modal datasets show the effectiveness of the proposed approach.
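Since the abstract's metric proposal builds on NDCG, a minimal implementation of the standard metric may help fix ideas. This is a generic sketch of plain NDCG for one ranked list, not the authors' modified version:

```python
import numpy as np

def ndcg(relevances, k=None):
    """Normalized Discounted Cumulative Gain for one ranked list.

    relevances: graded relevance of the retrieved items, in ranked order.
    k: optional cutoff (NDCG@k); None scores the full list.
    """
    rel = np.asarray(relevances, dtype=float)[:k]
    if rel.sum() == 0:
        return 0.0                      # no relevant item retrieved
    # Log-discount: position 1 gets weight 1, later positions less.
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = (rel * discounts).sum()
    # Ideal DCG: the same relevances sorted into the best possible order.
    ideal = np.sort(np.asarray(relevances, dtype=float))[::-1][:k]
    idcg = (ideal * discounts[:ideal.size]).sum()
    return dcg / idcg
```

A perfectly ordered list scores 1.0; any swap that demotes a more relevant item below a less relevant one lowers the score.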
Cross-modal retrieval in challenging scenarios using attributes
Cross-modal retrieval is an important field of research today because of the abundance of multimedia data. In this work, we attempt to address two challenging scenarios that we may encounter in real-life cross-modal retrieval, but which are relatively unexplored in the literature. First, due to the ever-increasing number of new categories of data, cross-modal algorithms should be able to generalize to categories that they have not seen during training. Second, the data available during testing may be degraded (for example, it has low resolution or noise) as compared to the data available during training. Here, we evaluate how these adverse conditions affect the performance of the state-of-the-art cross-modal approaches. We also propose a unified framework that can handle all these diverse and challenging scenarios without any modification. In the proposed approach, the data from different modalities are projected into a common semantic-preserving latent space in which the semantic relations given by the classname embeddings (attributes) are preserved. Extensive experiments on diverse cross-modal data, including image-text and RGB-depth, and comparison with the state-of-the-art approaches show the usefulness of the proposed approach for these challenging scenarios.
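The general idea of projecting each modality into a shared attribute space can be sketched in a few lines. The ridge-regression projections and cosine-similarity retrieval below are illustrative choices for this sketch, not the paper's exact formulation; the function names and dimensions are assumptions:

```python
import numpy as np

def fit_projection(feats, attrs, lam=1.0):
    """Ridge-regression projection of modality features into attribute space.

    feats: (N, D) features of one modality (e.g. image or text).
    attrs: (N, A) classname embeddings (attributes) of each sample.
    Returns W of shape (D, A) minimising ||feats @ W - attrs||^2 + lam ||W||^2.
    """
    d = feats.shape[1]
    return np.linalg.solve(feats.T @ feats + lam * np.eye(d), feats.T @ attrs)

def retrieve(query_feat, W_q, gallery_feats, W_g, topk=5):
    """Rank gallery items of another modality by cosine similarity
    in the shared attribute space."""
    q = query_feat @ W_q
    g = gallery_feats @ W_g
    q = q / (np.linalg.norm(q) + 1e-8)
    g = g / (np.linalg.norm(g, axis=1, keepdims=True) + 1e-8)
    return np.argsort(-(g @ q))[:topk]
```

Because both modalities are anchored to the same classname embeddings, a query from one modality can be matched against a gallery from another, including for classes represented only by their attribute vectors.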