Hierarchical Metric Learning for Optical Remote Sensing Scene Categorization
We address the problem of scene classification from optical remote sensing
(RS) images based on the paradigm of hierarchical metric learning. Ideally,
supervised metric learning strategies learn a projection of the training data
points to the class label space so as to minimize intra-class variance while
maximizing inter-class separability. However, standard metric learning
techniques do not incorporate the class interaction information in learning the
transformation matrix, which is often considered to be a bottleneck while
dealing with fine-grained visual categories. As a remedy, we propose to
organize the classes in a hierarchical fashion by exploring their visual
similarities and subsequently learn separate distance metric transformations
for the classes present at the non-leaf nodes of the tree. We employ an
iterative max-margin clustering strategy to obtain the hierarchical
organization of the classes. Experimental results obtained on the large-scale
NWPU-RESISC45 and the popular UC-Merced datasets demonstrate the efficacy of
the proposed hierarchical metric learning based RS scene recognition strategy
in comparison to the standard approaches.
Comment: Undergoing revision in GRS
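To make the hierarchy concrete, the following minimal Python sketch groups classes by the similarity of their mean features and fits a separate discriminative projection at each non-leaf node of a two-level tree. KMeans stands in for the paper's iterative max-margin clustering and LDA for the learned distance metric; the data, dimensions, and tree depth are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of hierarchical metric learning: classes are grouped into a
# two-level tree by the similarity of their mean features, and a separate
# discriminative projection is fitted at each non-leaf node. KMeans stands in
# for the paper's iterative max-margin clustering, and LDA for the learned
# metric; all data and dimensions are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
coarse = rng.normal(scale=10.0, size=(2, 64))             # two coarse "super-classes"
centers = coarse[np.repeat([0, 1], 4)] + rng.normal(scale=2.0, size=(8, 64))
y = rng.integers(0, 8, size=400)                          # 8 fine-grained classes
X = centers[y] + rng.normal(size=(400, 64))               # toy features

# 1) Group visually similar classes by clustering their mean feature vectors.
class_means = np.stack([X[y == c].mean(axis=0) for c in range(8)])
groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(class_means)

# 2) Root-node metric: separates the coarse groups.
root = LinearDiscriminantAnalysis().fit(X, groups[y])

# 3) One child-node metric per group: separates the classes inside that group.
children = {g: LinearDiscriminantAnalysis().fit(X[groups[y] == g], y[groups[y] == g])
            for g in range(2)}

def predict(x):
    """Route a sample down the tree, applying the metric learned at each node."""
    g = root.predict(x.reshape(1, -1))[0]
    return children[g].predict(x.reshape(1, -1))[0]

print(predict(X[0]), "true:", y[0])
```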
Information Seeking Behavior of Research Scholars of Vidyasagar University, West Bengal
The main objective of the study is to investigate the information seeking behavior of the research scholars of Vidyasagar University (VU), West Bengal. The study was also intended to identify their information needs and their awareness of the library services rendered by the central library of the university. The required data were collected from 100 researchers of the university through a structured questionnaire. Findings indicate that guidance in the use of library resources and services is necessary to help researchers meet some of their information requirements. Most of the researchers visit the library weekly (30%) or monthly (45%) and spend a maximum of 0-2 hours there, mainly for preparing research, studying journals, and keeping their knowledge up to date. Most of the researchers depend on the VU central library (27%) for information seeking and also collect information from other sources.
CMIR-NET : A Deep Learning Based Model For Cross-Modal Retrieval In Remote Sensing
We address the problem of cross-modal information retrieval in the domain of
remote sensing. In particular, we are interested in two application scenarios:
i) cross-modal retrieval between panchromatic (PAN) and multi-spectral imagery,
and ii) multi-label image retrieval between very high resolution (VHR) images
and speech based label annotations. Notice that these multi-modal retrieval
scenarios are more challenging than traditional uni-modal retrieval, given the
inherent differences in distributions between the
modalities. However, with the growing availability of multi-source remote
sensing data and the scarcity of semantic annotations, the task of
multi-modal retrieval has recently become extremely important. In this regard,
we propose a novel deep neural network based architecture which is designed
to learn a discriminative shared feature space for all the input modalities,
suitable for semantically coherent information retrieval. Extensive experiments
are carried out on the benchmark large-scale PAN - multi-spectral DSRSID
dataset and the multi-label UC-Merced dataset. Alongside the UC-Merced
dataset, we generate a corpus of speech signals corresponding to the labels.
Superior performance with respect to the current state-of-the-art is observed
in all the cases.
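The shared-space idea can be sketched as follows: two modality-specific encoders project paired samples into one L2-normalized space, trained with a symmetric InfoNCE-style contrastive loss so that retrieval reduces to cosine similarity. The MLP encoders, feature dimensions, and temperature below are illustrative assumptions rather than the paper's exact architecture.

```python
# Minimal sketch of a cross-modal shared embedding space: two modality-specific
# encoders map into one space where matching pairs are pulled together.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    def __init__(self, in_dim, shared_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, shared_dim),
        )

    def forward(self, x):
        # L2-normalize so retrieval reduces to cosine similarity.
        return F.normalize(self.net(x), dim=-1)

pan_enc = ModalityEncoder(in_dim=1024)   # e.g. panchromatic features
ms_enc = ModalityEncoder(in_dim=512)     # e.g. multi-spectral features

pan = torch.randn(32, 1024)              # a toy batch of paired samples
ms = torch.randn(32, 512)

z_pan, z_ms = pan_enc(pan), ms_enc(ms)
logits = z_pan @ z_ms.t() / 0.07         # similarity matrix, temperature-scaled
targets = torch.arange(32)               # i-th PAN sample matches i-th MS sample
loss = (F.cross_entropy(logits, targets) +
        F.cross_entropy(logits.t(), targets)) / 2
loss.backward()

# Retrieval: rank multi-spectral items by cosine similarity to a PAN query.
ranking = (z_pan[0] @ z_ms.t()).argsort(descending=True)
```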
Prototypical quadruplet for few-shot class incremental learning
Many modern computer vision algorithms suffer from two major bottlenecks:
scarcity of data and learning new tasks incrementally. When the model is
trained with new batches of data, it loses its ability to classify the
previously seen data, a phenomenon termed catastrophic forgetting. Conventional
methods have tried to mitigate catastrophic forgetting of the previously
learned data, but at the cost of compromised training in the current session.
The state-of-the-art generative replay based approaches use complicated
structures such as generative adversarial network (GAN) to deal with
catastrophic forgetting. Additionally, training a GAN with few samples may lead
to instability. In this work, we present a novel method to deal with these two
major hurdles. Our method identifies a better embedding space with an improved
contrastive loss to make classification more robust. Moreover, our approach is
able to retain previously acquired knowledge in the embedding space even when
trained with new classes. We update the class prototypes of previous sessions
during training such that they continue to represent the true class means. This
is of prime importance as our classification rule is based on the nearest class
mean classification strategy. We demonstrate that the embedding space remains
intact after training the model with new classes.
We also show that our method performs better than existing state-of-the-art
algorithms in terms of accuracy across different sessions.
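Since the classification rule is nearest class mean, the core mechanics can be sketched as prototype maintenance plus a cosine-distance lookup. The running-mean update below is an illustrative stand-in for the paper's prototype update; the embedding dimension and session splits are toy assumptions.

```python
# Minimal sketch of nearest-class-mean classification with a running prototype
# update, so old-class means stay representative across incremental sessions.
import torch
import torch.nn.functional as F

prototypes = {}          # class id -> (mean embedding, sample count)

def update_prototypes(embeddings, labels):
    """Incrementally fold new embeddings into the stored class means."""
    for c in labels.unique().tolist():
        z = embeddings[labels == c].mean(dim=0)
        n = int((labels == c).sum())
        if c in prototypes:
            mu, m = prototypes[c]
            prototypes[c] = ((mu * m + z * n) / (m + n), m + n)
        else:
            prototypes[c] = (z, n)

def classify(embeddings):
    """Assign each embedding to the nearest class mean (cosine distance)."""
    classes = sorted(prototypes)
    protos = F.normalize(torch.stack([prototypes[c][0] for c in classes]), dim=-1)
    sims = F.normalize(embeddings, dim=-1) @ protos.t()
    return torch.tensor(classes)[sims.argmax(dim=-1)]

# Session 1: base classes arrive; session 2: new classes arrive incrementally.
update_prototypes(torch.randn(50, 64), torch.randint(0, 5, (50,)))
update_prototypes(torch.randn(20, 64), torch.randint(5, 8, (20,)))
print(classify(torch.randn(4, 64)))
```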
GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning
Large-scale foundation models, such as CLIP, have demonstrated remarkable
success in visual recognition tasks by embedding images in a semantically rich
space. Self-supervised learning (SSL) has also shown promise in improving
visual recognition by learning invariant features. However, the combination of
CLIP with SSL is found to face challenges due to the multi-task framework that
blends CLIP's contrastive loss and SSL's loss, including difficulties with loss
weighting and inconsistency among different views of images in CLIP's output
space. To overcome these challenges, we propose a prompt learning-based model
called GOPro, which is a unified framework that ensures similarity between
various augmented views of input images in a shared image-text embedding space,
using a pair of learnable image and text projectors atop CLIP, to promote
invariance and generalizability. To automatically learn such prompts, we
leverage the visual content and style primitives extracted from pre-trained
CLIP and adapt them to the target task. In addition to CLIP's cross-domain
contrastive loss, we introduce a visual contrastive loss and a novel prompt
consistency loss, considering the different views of the images. GOPro is
trained end-to-end on all three loss objectives, combining the strengths of
CLIP and SSL in a principled manner. Empirical evaluations demonstrate that
GOPro outperforms the state-of-the-art prompting techniques on three
challenging domain generalization tasks across multiple benchmarks by a
significant margin. Our code is available at
https://github.com/mainaksingha01/GOPro.
Comment: Accepted at BMVC 202
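The three-loss objective can be sketched as follows, assuming frozen backbone features and small learnable projectors; a simple view-consistency term stands in for the paper's prompt consistency loss, and all dimensions, weights, and batch shapes are illustrative.

```python
# Minimal sketch of the three-part objective: frozen backbone features pass
# through learnable projectors, and an image-text contrastive loss, a visual
# contrastive loss between augmented views, and a consistency term are summed.
import torch
import torch.nn as nn
import torch.nn.functional as F

img_proj = nn.Linear(512, 256)   # learnable projector atop the frozen image encoder
txt_proj = nn.Linear(512, 256)   # learnable projector atop the frozen text encoder

def info_nce(a, b, t=0.07):
    """Symmetric contrastive loss over a batch of matched pairs."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / t
    y = torch.arange(a.size(0))
    return (F.cross_entropy(logits, y) + F.cross_entropy(logits.t(), y)) / 2

# Stand-ins for frozen-backbone features: two augmented views of each image
# plus the matching text features.
v1, v2 = torch.randn(16, 512), torch.randn(16, 512)
txt = torch.randn(16, 512)

z1, z2, zt = img_proj(v1), img_proj(v2), txt_proj(txt)

loss_clip = info_nce(z1, zt)                     # image-text contrastive loss
loss_vis = info_nce(z1, z2)                      # visual contrastive loss (views)
loss_cons = F.mse_loss(F.normalize(z1, dim=-1),  # consistency: both views should
                       F.normalize(z2, dim=-1))  # map to the same point
loss = loss_clip + loss_vis + loss_cons
loss.backward()
```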
Pharmacognostical, physiochemical and phytochemical evaluation of leaf, stem and root of orchid Dendrobium ochreatum
The present work aimed to carry out a pharmacognostical and phytochemical evaluation of the individual root, stem, and leaves of the orchid “Dendrobium ochreatum.” The plant was sun dried and ground to a fine powder using a mechanical grinder, followed by sieving. The fine powder was collected and subjected to different pharmacognostical studies, such as fluorescence analysis under UV light at different wavelengths. Physicochemical parameters of the dried plant parts, such as ash values and loss on drying, were also evaluated. Each plant part (root, stem, and leaves) was separated and subjected to Soxhlet extraction using solvents of different polarity, i.e., hexane, chloroform, and ethanol, in a gradient elution technique. In total, nine plant extracts were obtained. Phytochemical screening revealed the presence of important phytoconstituents such as alkaloids, glycosides, and saponins, whereas only the chloroform extract of the stem exhibited the presence of steroids/phytosterols.
C-SAW: Self-Supervised Prompt Learning for Image Generalization in Remote Sensing
We focus on domain and class generalization problems in analyzing optical
remote sensing images, using the large-scale pre-trained vision-language model
(VLM), CLIP. While contrastively trained VLMs show impressive zero-shot
generalization performance, their effectiveness is limited when dealing with
diverse domains during training and testing. Existing prompt learning
techniques overlook the importance of incorporating domain and content
information into the prompts, which results in a drop in performance while
dealing with such multi-domain data. To address these challenges, we propose a
solution that ensures domain-invariant prompt learning while enhancing the
expressiveness of visual features. We observe that CLIP's vision encoder
struggles to identify contextual image information, particularly when image
patches are jumbled up. This issue is especially severe in optical remote
sensing images, where land-cover classes exhibit well-defined contextual
appearances. To this end, we introduce C-SAW, a method that complements CLIP
with a self-supervised loss in the visual space and a novel prompt learning
technique that emphasizes both visual domain and content-specific features. We
keep the CLIP backbone frozen and introduce a small set of projectors for both
the CLIP encoders to train C-SAW contrastively. Experimental results
demonstrate the superiority of C-SAW across multiple remote sensing benchmarks
and different generalization tasks.
Comment: Accepted in ACM ICVGIP 202
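The patch-jumbling observation suggests a jigsaw-style auxiliary objective, sketched below: an image is cut into patches, the patches are shuffled, and a small head is trained to recover each patch's original position. This is a generic stand-in for a self-supervised loss in the visual space, not C-SAW's exact formulation; the patch size and head width are assumptions.

```python
# Minimal sketch of a jigsaw-style self-supervised signal: shuffle the patches
# of an image and train an auxiliary head to recover the permutation.
import torch
import torch.nn as nn
import torch.nn.functional as F

def to_patches(img, p=16):
    """Split a (C, H, W) image into a sequence of flattened p x p patches."""
    c, h, w = img.shape
    patches = img.unfold(1, p, p).unfold(2, p, p)   # (C, H/p, W/p, p, p)
    return patches.permute(1, 2, 0, 3, 4).reshape(-1, c * p * p)

img = torch.randn(3, 64, 64)            # toy image -> 16 patches of 16x16
patches = to_patches(img)               # (16, 768)

perm = torch.randperm(patches.size(0))  # jumble the patch order
shuffled = patches[perm]

# Auxiliary head: predict each patch's original position from its content.
head = nn.Sequential(nn.Linear(768, 128), nn.ReLU(),
                     nn.Linear(128, patches.size(0)))
logits = head(shuffled)                 # (16, 16): position scores per patch
loss = F.cross_entropy(logits, perm)    # target: original index of each patch
loss.backward()
```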