Hierarchical Metric Learning for Optical Remote Sensing Scene Categorization
We address the problem of scene classification from optical remote sensing
(RS) images based on the paradigm of hierarchical metric learning. Ideally,
supervised metric learning strategies learn a projection of the training data
points to the class label space so as to minimize intra-class variance while
maximizing inter-class separability. However, standard metric learning
techniques do not incorporate the class interaction information in learning the
transformation matrix, which is often considered to be a bottleneck while
dealing with fine-grained visual categories. As a remedy, we propose to
organize the classes in a hierarchical fashion by exploring their visual
similarities and subsequently learn separate distance metric transformations
for the classes present at the non-leaf nodes of the tree. We employ an
iterative max-margin clustering strategy to obtain the hierarchical
organization of the classes. Experimental results obtained on the large-scale
NWPU-RESISC45 and the popular UC-Merced datasets demonstrate the efficacy of
the proposed hierarchical metric learning based RS scene recognition strategy
in comparison to the standard approaches.
Comment: Undergoing revision in GRS
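To make the hierarchy concrete, the following minimal Python sketch groups classes by the similarity of their mean features and fits a separate discriminative projection at each non-leaf node of a two-level tree. KMeans stands in for the paper's iterative max-margin clustering and LDA for the learned distance metric; the data, dimensions, and tree depth are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of hierarchical metric learning: classes are grouped into a
# two-level tree by the similarity of their mean features, and a separate
# discriminative projection is fitted at each non-leaf node. KMeans stands in
# for the paper's iterative max-margin clustering, and LDA for the learned
# metric; all data and dimensions are illustrative.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
coarse = rng.normal(scale=10.0, size=(2, 64))             # two coarse "super-classes"
centers = coarse[np.repeat([0, 1], 4)] + rng.normal(scale=2.0, size=(8, 64))
y = rng.integers(0, 8, size=400)                          # 8 fine-grained classes
X = centers[y] + rng.normal(size=(400, 64))               # toy features

# 1) Group visually similar classes by clustering their mean feature vectors.
class_means = np.stack([X[y == c].mean(axis=0) for c in range(8)])
groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(class_means)

# 2) Root-node metric: separates the coarse groups.
root = LinearDiscriminantAnalysis().fit(X, groups[y])

# 3) One child-node metric per group: separates the classes inside that group.
children = {g: LinearDiscriminantAnalysis().fit(X[groups[y] == g], y[groups[y] == g])
            for g in range(2)}

def predict(x):
    """Route a sample down the tree, applying the metric learned at each node."""
    g = root.predict(x.reshape(1, -1))[0]
    return children[g].predict(x.reshape(1, -1))[0]

print(predict(X[0]), "true:", y[0])
```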
Information Seeking Behavior of Research Scholars of Vidyasagar University, West Bengal
The main objective of the study is to investigate the information seeking behavior of the research scholars of Vidyasagar University (VU), West Bengal. The study was also intended to identify their information needs and their awareness of the library services rendered by the central library of the university. The required data were collected from 100 researchers of the university through a structured questionnaire. Findings indicate that guidance in the use of library resources and services is necessary to help researchers meet some of their information requirements. Most of the researchers visit the library weekly (30%) or monthly (45%) and spend a maximum of 0-2 hours there, mainly for preparing research, studying journals, and keeping their knowledge up to date. Most of the researchers depend on the VU central library (27%) for information seeking and also collect information from other sources.
CMIR-NET : A Deep Learning Based Model For Cross-Modal Retrieval In Remote Sensing
We address the problem of cross-modal information retrieval in the domain of
remote sensing. In particular, we are interested in two application scenarios:
i) cross-modal retrieval between panchromatic (PAN) and multi-spectral imagery,
and ii) multi-label image retrieval between very high resolution (VHR) images
and speech based label annotations. Notice that these multi-modal retrieval
scenarios are more challenging than traditional uni-modal retrieval, given the
inherent differences in distributions between the
modalities. However, with the growing availability of multi-source remote
sensing data and the scarcity of semantic annotations, the task of
multi-modal retrieval has recently become extremely important. In this regard,
we propose a novel deep neural network based architecture which is designed
to learn a discriminative shared feature space for all the input modalities,
suitable for semantically coherent information retrieval. Extensive experiments
are carried out on the benchmark large-scale PAN - multi-spectral DSRSID
dataset and the multi-label UC-Merced dataset. Alongside the UC-Merced
dataset, we generate a corpus of speech signals corresponding to the labels.
Superior performance with respect to the current state-of-the-art is observed
in all the cases.
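The shared-space idea can be sketched as follows: two modality-specific encoders project paired samples into one L2-normalized space, trained with a symmetric InfoNCE-style contrastive loss so that retrieval reduces to cosine similarity. The MLP encoders, feature dimensions, and temperature below are illustrative assumptions rather than the paper's exact architecture.

```python
# Minimal sketch of a cross-modal shared embedding space: two modality-specific
# encoders map into one space where matching pairs are pulled together.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    def __init__(self, in_dim, shared_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, shared_dim),
        )

    def forward(self, x):
        # L2-normalize so retrieval reduces to cosine similarity.
        return F.normalize(self.net(x), dim=-1)

pan_enc = ModalityEncoder(in_dim=1024)   # e.g. panchromatic features
ms_enc = ModalityEncoder(in_dim=512)     # e.g. multi-spectral features

pan = torch.randn(32, 1024)              # a toy batch of paired samples
ms = torch.randn(32, 512)

z_pan, z_ms = pan_enc(pan), ms_enc(ms)
logits = z_pan @ z_ms.t() / 0.07         # similarity matrix, temperature-scaled
targets = torch.arange(32)               # i-th PAN sample matches i-th MS sample
loss = (F.cross_entropy(logits, targets) +
        F.cross_entropy(logits.t(), targets)) / 2
loss.backward()

# Retrieval: rank multi-spectral items by cosine similarity to a PAN query.
ranking = (z_pan[0] @ z_ms.t()).argsort(descending=True)
```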
Prototypical quadruplet for few-shot class incremental learning
Many modern computer vision algorithms suffer from two major bottlenecks:
scarcity of data and learning new tasks incrementally. When the model is
trained with new batches of data, it loses its ability to classify the
previously seen data, a phenomenon termed catastrophic forgetting. Conventional
methods have tried to mitigate catastrophic forgetting of the previously
learned data, but at the cost of compromised training in the current session.
The state-of-the-art generative replay based approaches use complicated
structures such as generative adversarial network (GAN) to deal with
catastrophic forgetting. Additionally, training a GAN with few samples may lead
to instability. In this work, we present a novel method to deal with these two
major hurdles. Our method identifies a better embedding space with an improved
contrastive loss to make classification more robust. Moreover, our approach is
able to retain previously acquired knowledge in the embedding space even when
trained with new classes. We update the class prototypes of previous sessions
during training such that they continue to represent the true class means. This
is of prime importance as our classification rule is based on the nearest class
mean classification strategy. We demonstrate that the embedding space remains
intact after training the model with new classes.
We also show that our method performs better than existing state-of-the-art
algorithms in terms of accuracy across different sessions.
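Since the classification rule is nearest class mean, the core mechanics can be sketched as prototype maintenance plus a cosine-distance lookup. The running-mean update below is an illustrative stand-in for the paper's prototype update; the embedding dimension and session splits are toy assumptions.

```python
# Minimal sketch of nearest-class-mean classification with a running prototype
# update, so old-class means stay representative across incremental sessions.
import torch
import torch.nn.functional as F

prototypes = {}          # class id -> (mean embedding, sample count)

def update_prototypes(embeddings, labels):
    """Incrementally fold new embeddings into the stored class means."""
    for c in labels.unique().tolist():
        z = embeddings[labels == c].mean(dim=0)
        n = int((labels == c).sum())
        if c in prototypes:
            mu, m = prototypes[c]
            prototypes[c] = ((mu * m + z * n) / (m + n), m + n)
        else:
            prototypes[c] = (z, n)

def classify(embeddings):
    """Assign each embedding to the nearest class mean (cosine distance)."""
    classes = sorted(prototypes)
    protos = F.normalize(torch.stack([prototypes[c][0] for c in classes]), dim=-1)
    sims = F.normalize(embeddings, dim=-1) @ protos.t()
    return torch.tensor(classes)[sims.argmax(dim=-1)]

# Session 1: base classes arrive; session 2: new classes arrive incrementally.
update_prototypes(torch.randn(50, 64), torch.randint(0, 5, (50,)))
update_prototypes(torch.randn(20, 64), torch.randint(5, 8, (20,)))
print(classify(torch.randn(4, 64)))
```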
GOPro: Generate and Optimize Prompts in CLIP using Self-Supervised Learning
Large-scale foundation models, such as CLIP, have demonstrated remarkable
success in visual recognition tasks by embedding images in a semantically rich
space. Self-supervised learning (SSL) has also shown promise in improving
visual recognition by learning invariant features. However, the combination of
CLIP with SSL is found to face challenges due to the multi-task framework that
blends CLIP's contrastive loss and SSL's loss, including difficulties with loss
weighting and inconsistency among different views of images in CLIP's output
space. To overcome these challenges, we propose a prompt learning-based model
called GOPro, which is a unified framework that ensures similarity between
various augmented views of input images in a shared image-text embedding space,
using a pair of learnable image and text projectors atop CLIP, to promote
invariance and generalizability. To automatically learn such prompts, we
leverage the visual content and style primitives extracted from pre-trained
CLIP and adapt them to the target task. In addition to CLIP's cross-domain
contrastive loss, we introduce a visual contrastive loss and a novel prompt
consistency loss, considering the different views of the images. GOPro is
trained end-to-end on all three loss objectives, combining the strengths of
CLIP and SSL in a principled manner. Empirical evaluations demonstrate that
GOPro outperforms the state-of-the-art prompting techniques on three
challenging domain generalization tasks across multiple benchmarks by a
significant margin. Our code is available at
https://github.com/mainaksingha01/GOPro.
Comment: Accepted at BMVC 202
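The three-loss objective can be sketched as follows, assuming frozen backbone features and small learnable projectors; a simple view-consistency term stands in for the paper's prompt consistency loss, and all dimensions, weights, and batch shapes are illustrative.

```python
# Minimal sketch of the three-part objective: frozen backbone features pass
# through learnable projectors, and an image-text contrastive loss, a visual
# contrastive loss between augmented views, and a consistency term are summed.
import torch
import torch.nn as nn
import torch.nn.functional as F

img_proj = nn.Linear(512, 256)   # learnable projector atop the frozen image encoder
txt_proj = nn.Linear(512, 256)   # learnable projector atop the frozen text encoder

def info_nce(a, b, t=0.07):
    """Symmetric contrastive loss over a batch of matched pairs."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / t
    y = torch.arange(a.size(0))
    return (F.cross_entropy(logits, y) + F.cross_entropy(logits.t(), y)) / 2

# Stand-ins for frozen-backbone features: two augmented views of each image
# plus the matching text features.
v1, v2 = torch.randn(16, 512), torch.randn(16, 512)
txt = torch.randn(16, 512)

z1, z2, zt = img_proj(v1), img_proj(v2), txt_proj(txt)

loss_clip = info_nce(z1, zt)                     # image-text contrastive loss
loss_vis = info_nce(z1, z2)                      # visual contrastive loss (views)
loss_cons = F.mse_loss(F.normalize(z1, dim=-1),  # consistency: both views should
                       F.normalize(z2, dim=-1))  # map to the same point
loss = loss_clip + loss_vis + loss_cons
loss.backward()
```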
Pharmacognostical, physiochemical and phytochemical evaluation of leaf, stem and root of orchid Dendrobium ochreatum
The present work aimed to carry out a pharmacognostical and phytochemical evaluation of the individual root, stem, and leaves of the orchid “Dendrobium ochreatum.” The plant was sun dried and ground to a fine powder using a mechanical grinder, followed by sieving. The fine powder was collected and subjected to different pharmacognostical studies, such as fluorescence analysis under UV light at different wavelengths. Physicochemical parameters of the dried plant parts, such as ash values and loss on drying, were also evaluated. Each plant part (root, stem, and leaves) was separated and subjected to Soxhlet extraction using solvents of different polarity, i.e., hexane, chloroform, and ethanol, in a gradient elution technique. In total, nine plant extracts were obtained. Phytochemical screening revealed the presence of important phytoconstituents such as alkaloids, glycosides, and saponins, whereas only the chloroform extract of the stem exhibited the presence of steroids/phytosterols.
C-SAW: Self-Supervised Prompt Learning for Image Generalization in Remote Sensing
We focus on domain and class generalization problems in analyzing optical
remote sensing images, using the large-scale pre-trained vision-language model
(VLM), CLIP. While contrastively trained VLMs show impressive zero-shot
generalization performance, their effectiveness is limited when dealing with
diverse domains during training and testing. Existing prompt learning
techniques overlook the importance of incorporating domain and content
information into the prompts, which results in a drop in performance while
dealing with such multi-domain data. To address these challenges, we propose a
solution that ensures domain-invariant prompt learning while enhancing the
expressiveness of visual features. We observe that CLIP's vision encoder
struggles to identify contextual image information, particularly when image
patches are jumbled up. This issue is especially severe in optical remote
sensing images, where land-cover classes exhibit well-defined contextual
appearances. To this end, we introduce C-SAW, a method that complements CLIP
with a self-supervised loss in the visual space and a novel prompt learning
technique that emphasizes both visual domain and content-specific features. We
keep the CLIP backbone frozen and introduce a small set of projectors for both
the CLIP encoders to train C-SAW contrastively. Experimental results
demonstrate the superiority of C-SAW across multiple remote sensing benchmarks
and different generalization tasks.
Comment: Accepted in ACM ICVGIP 202
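The patch-jumbling observation suggests a jigsaw-style auxiliary objective, sketched below: an image is cut into patches, the patches are shuffled, and a small head is trained to recover each patch's original position. This is a generic stand-in for a self-supervised loss in the visual space, not C-SAW's exact formulation; the patch size and head width are assumptions.

```python
# Minimal sketch of a jigsaw-style self-supervised signal: shuffle the patches
# of an image and train an auxiliary head to recover the permutation.
import torch
import torch.nn as nn
import torch.nn.functional as F

def to_patches(img, p=16):
    """Split a (C, H, W) image into a sequence of flattened p x p patches."""
    c, h, w = img.shape
    patches = img.unfold(1, p, p).unfold(2, p, p)   # (C, H/p, W/p, p, p)
    return patches.permute(1, 2, 0, 3, 4).reshape(-1, c * p * p)

img = torch.randn(3, 64, 64)            # toy image -> 16 patches of 16x16
patches = to_patches(img)               # (16, 768)

perm = torch.randperm(patches.size(0))  # jumble the patch order
shuffled = patches[perm]

# Auxiliary head: predict each patch's original position from its content.
head = nn.Sequential(nn.Linear(768, 128), nn.ReLU(),
                     nn.Linear(128, patches.size(0)))
logits = head(shuffled)                 # (16, 16): position scores per patch
loss = F.cross_entropy(logits, perm)    # target: original index of each patch
loss.backward()
```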