10 research outputs found
Analyzing the Efficacy of an LLM-Only Approach for Image-based Document Question Answering
Recent document question answering models consist of two key components: the
vision encoder, which captures layout and visual elements in images, and a
Large Language Model (LLM) that helps contextualize questions to the image and
supplements them with external world knowledge to generate accurate answers.
However, the relative contributions of the vision encoder and the language
model in these tasks remain unclear. This is especially interesting given the
effectiveness of instruction-tuned LLMs, which exhibit remarkable adaptability
to new tasks. To this end, we explore the following aspects in this work: (1)
the efficacy of an LLM-only approach on document question answering tasks; (2)
strategies for serializing textual information within document images and
feeding it directly to an instruction-tuned LLM, thus bypassing the need for an
explicit vision encoder; and (3) a thorough quantitative analysis of the
feasibility of such an approach. Our comprehensive analysis encompasses six diverse
benchmark datasets, utilizing LLMs of varying scales. Our findings reveal that
a strategy exclusively reliant on the LLM yields results that are on par with
or closely approach state-of-the-art performance across a range of datasets. We
posit that this evaluation framework will serve as a guiding resource for
selecting appropriate datasets for future research endeavors that emphasize the
fundamental importance of layout and image content information.
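To make the serialization idea in this abstract concrete, here is a minimal sketch, assuming OCR output is available as words with bounding boxes. The `Word` fields, the `line_tol` parameter, the prompt template, and both function names are hypothetical illustrations; the abstract does not specify which serialization strategies the authors evaluate.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the bounding box (assumed OCR output)
    y: float  # top edge of the bounding box
    h: float  # height of the bounding box


def serialize_words(words, line_tol=0.5):
    """Group words into lines by vertical position, then read left to right."""
    lines = []
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        # A word joins the current line if its top edge is close to the
        # top edge of that line's first word; otherwise it starts a new line.
        if lines and abs(w.y - lines[-1][0].y) <= line_tol * w.h:
            lines[-1].append(w)
        else:
            lines.append([w])
    return "\n".join(
        " ".join(w.text for w in sorted(line, key=lambda w: w.x))
        for line in lines
    )


def build_prompt(words, question):
    """Hypothetical prompt template for an instruction-tuned LLM."""
    return f"Document:\n{serialize_words(words)}\n\nQuestion: {question}\nAnswer:"
```

A simple top-to-bottom, left-to-right reading order like this discards most layout information, which is the kind of trade-off an LLM-only pipeline has to accept.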
Is it an i or an l: Test-time Adaptation of Text Line Recognition Models
Recognizing text lines from images is a challenging problem, especially for
handwritten documents due to large variations in writing styles. While text
line recognition models are generally trained on large corpora of real and
synthetic data, such models can still make frequent mistakes if the handwriting
is inscrutable or the image acquisition process adds corruptions, such as
noise, blur, compression, etc. Writing style is generally quite consistent for
an individual, which can be leveraged to correct mistakes made by such models.
Motivated by this, we introduce the problem of adapting text line recognition
models during test time. We focus on a challenging and realistic setting where,
given only a single test image consisting of multiple text lines, the task is
to adapt the model such that it performs better on the image, without any
labels. We propose an iterative self-training approach that uses feedback from
the language model to update the optical model, with confident self-labels in
each iteration. The confidence measure is based on an augmentation mechanism
that evaluates the divergence of the prediction of the model in a local region.
We perform rigorous evaluation of our method on several benchmark datasets as
well as their corrupted versions. Experimental results on multiple datasets
spanning multiple scripts show that the proposed adaptation method offers an
absolute improvement of up to 8% in character error rate with just a few
iterations of self-training at test time.
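The confidence-based selection of self-labels can be illustrated with a simplified stand-in. The sketch below assumes that the recognizer's string predictions on a text line and on several augmented views of it are already available, and it scores a line by how much those predictions disagree. This string-level divergence, the `threshold` value, and all function names are assumptions for illustration, not the measure defined in the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def divergence(base, augmented):
    """Mean normalized edit distance between the base prediction and the
    predictions obtained on augmented views of the same line."""
    if not augmented:
        return 0.0
    norm = max(len(base), 1)
    return sum(edit_distance(base, p) for p in augmented) / (len(augmented) * norm)


def select_confident(predictions, threshold=0.1):
    """Keep base predictions whose divergence is at most `threshold`.

    `predictions` is a list of (base_prediction, [augmented_predictions]).
    """
    return [base for base, aug in predictions if divergence(base, aug) <= threshold]
```

In a self-training loop, the selected predictions would serve as pseudo-labels for the next update of the optical model.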
Private and Efficient Meta-Learning with Low Rank and Sparse Decomposition
Meta-learning is critical for a variety of practical ML systems -- like
personalized recommendations systems -- that are required to generalize to new
tasks despite a small number of task-specific training points. Existing
meta-learning techniques use two complementary approaches of either learning a
low-dimensional representation of points for all tasks, or task-specific
fine-tuning of a global model trained using all the tasks. In this work, we
propose a novel meta-learning framework that combines both the techniques to
enable handling of a large number of data-starved tasks. Our framework models
network weights as a sum of low-rank and sparse matrices. This allows us to
capture information from multiple domains together in the low-rank part while
still allowing task specific personalization using the sparse part. We
instantiate and study the framework in the linear setting, where the problem
reduces to that of estimating the sum of a rank-$r$ and a $k$-column sparse
matrix using a small number of linear measurements. We propose an alternating
minimization method with hard thresholding -- AMHT-LRS -- to learn the low-rank
and sparse part effectively and efficiently. For the realizable, Gaussian data
setting, we show that AMHT-LRS indeed solves the problem efficiently with
nearly optimal samples. We extend AMHT-LRS to ensure that it preserves privacy
of each individual user in the dataset, while still ensuring strong
generalization with nearly optimal number of samples. Finally, on multiple
datasets, we demonstrate that the framework allows personalized models to
obtain superior performance in the data-scarce regime.
Comment: 97 pages, 3 figures
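The low-rank plus sparse model described in this abstract can be written out as follows. This is a sketch under assumed notation: $r$ for the rank bound, $k$ for the number of nonzero columns, $A_j$ for the measurement matrices, and $\mathrm{HT}_k$ for a column-wise hard-thresholding operator are symbols chosen here for illustration. The exact update rules of AMHT-LRS are not given in the abstract and are not reproduced.

```latex
% Model: the unknown matrix is a sum of a low-rank and a column-sparse part.
M = L + S, \qquad \operatorname{rank}(L) \le r, \qquad
\bigl|\{\, c : S_{:,c} \ne 0 \,\}\bigr| \le k

% Linear measurements (m of them, with m small):
y_j = \langle A_j, M \rangle, \qquad j = 1, \dots, m

% Column-wise hard thresholding: keep the k columns of largest norm.
\mathrm{HT}_k(Z)_{:,c} =
\begin{cases}
  Z_{:,c} & \text{if } \|Z_{:,c}\|_2 \text{ is among the } k \text{ largest column norms of } Z, \\
  0       & \text{otherwise.}
\end{cases}
```

A generic alternating scheme of this kind would fit $L$ with $S$ held fixed, then update $S$ with $L$ held fixed and apply $\mathrm{HT}_k$ to restore sparsity.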
Study on Machine Learning and Deep Learning Methods for Cancer Detection
Cancer causes death of about million people every year. It is frequently diagnosed and is a major cause of death in men and women. Cancer is a group of diseases involving abnormal cell growth that can spread to other parts of the body. Colonography uses low-dose radiation computed tomography (CT) scanning, performed with a special X-ray machine, to obtain an internal view of cancer tumors. Radiologists examine these images with computer tools to find tumor-like structures. CT colonography images contain noise such as the lungs, the small intestine, and instruments present during image capture. Cancer occurrence is detected mainly from shape features, and eliminating shapes similar to tumors is challenging. To tackle these issues, image processing techniques are applied using a deep learning algorithm, the Convolutional Neural Network (CNN), and the results are compared with classical machine learning algorithms. For comparison, the analysis uses the classical Random Forest (RF) and k-nearest neighbour (KNN) algorithms on an extracted texture feature, the local binary pattern (LBP), and a shape feature, the histogram of oriented gradients (HOG).
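For readers unfamiliar with the local binary pattern texture feature mentioned above, here is a minimal sketch of the basic 8-neighbour variant on a grayscale image stored as a list of rows. The neighbour ordering, the `>=` comparison, and the function names are choices made for this illustration; the abstract does not say which LBP variant or parameters the study uses.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist
```

The resulting histogram is the kind of fixed-length feature vector that a classifier such as Random Forest or KNN can take as input.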
Involvement of mitochondrial intrinsic pathway in rhSP-D (recombinant human Surfactant Protein D) induced apoptosis of prostate cancer cells
Surfactant protein D (SP-D), an innate immune molecule, has an indispensable role in host defense and regulation of inflammation. We reported a novel anti-cancer role of a recombinant fragment of human SP-D (rhSP-D) in leukemic and breast tumor cell lines. A recent study revealed a correlation of SP-D expression in prostate cancer tissues with increased Gleason score and tumor volume. In the present study, we elucidated the role of rhSP-D in prostate cancer using LNCaP (androgen dependent) and PC3 (androgen independent) cell lines and primary prostate cancer cells. In accordance with our previous finding, rhSP-D induced apoptosis in LNCaP and PC3 cell lines in a time- and dose-dependent manner. Isolated primary prostate cancer epithelial cells from explant cultures of tissue biopsies of prostate cancer patients were characterised for the presence of Cytokeratin (epithelial cell), CD10 (negative) and CD164 (positive) markers at protein and transcript level. The anti-prostate tumor effect of rhSP-D was established in the isolated primary prostate cancer epithelial cells. Importantly, primary normal prostate epithelial cells treated with similar concentrations of rhSP-D showed no adverse effect on viability. rhSP-D upregulated phospho p53 and transcripts of Bax and reduced Bcl2 transcripts, suggesting p53-mediated apoptosis in LNCaP cells. rhSP-D induced apoptosis in PC3 cells by lowering phospho ERK1/2 levels and increasing BAD transcripts, a distinct mechanism of programmed cell death. Increased release of cytochrome c upon rhSP-D treatment confirmed the activation of the mitochondrial intrinsic apoptotic pathway in both cell types. rhSP-D treatment downregulated transcripts of Bcl2 while upregulating PUMA transcripts, suggesting p53-mediated apoptosis in primary prostate cancer cells. Also, a positive TUNEL assay confirmed induction of apoptosis by rhSP-D in cancer tissue biopsies.
Collectively, our findings reveal an integral role of SP-D in immune surveillance against prostate cancer mediated by two distinct mitochondrial apoptotic mechanisms.