SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation
Document layout analysis is a well-known problem in the document research
community and has been explored extensively, yielding a multitude of
solutions ranging from text mining and recognition to graph-based
representation, visual feature extraction, etc. However, most existing
works have ignored a crucial fact: the scarcity of labeled data. With
growing internet connectivity in personal life, an enormous number of
documents have become available in the public domain, making data
annotation a tedious task. We address this challenge using
self-supervision. Unlike the few existing
self-supervised document segmentation approaches which use text mining and
textual labels, we use a complete vision-based approach in pre-training without
any ground-truth label or its derivative. Instead, we generate pseudo-layouts
from the document images to pre-train an image encoder to learn the document
object representation and localization in a self-supervised framework before
fine-tuning it with an object detection model. We show that our pipeline
sets a new benchmark in this context and performs on par with, if not
better than, existing methods and supervised counterparts. The code is
made publicly available at: https://github.com/MaitySubhajit/SelfDocSeg
Comment: Accepted at the 17th International Conference on Document Analysis
and Recognition (ICDAR 2023)
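The pseudo-layout idea above can be sketched in a minimal form: derive candidate document-object boxes from the image alone, with no ground-truth labels. The projection-profile grouping below is an illustrative stand-in, not the paper's actual pseudo-layout generator; the threshold and gap parameters are assumptions.

```python
import numpy as np

def pseudo_layout(gray, ink_thresh=128, min_gap=3):
    """Derive pseudo bounding boxes (x0, y0, x1, y1) from a grayscale
    document image by thresholding "ink" pixels and grouping runs of
    consecutive dark rows -- a crude, label-free layout estimate."""
    ink = gray < ink_thresh                      # dark pixels count as ink
    row_has_ink = ink.any(axis=1)
    boxes, start, gap = [], None, 0
    for y, has in enumerate(row_has_ink):
        if has:
            if start is None:
                start = y                        # a text block begins
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:                   # enough blank rows: block ends
                band = ink[start:y - gap + 1]
                xs = np.where(band.any(axis=0))[0]
                boxes.append((int(xs[0]), start, int(xs[-1]), y - gap))
                start, gap = None, 0
    if start is not None:                        # block runs to the bottom edge
        band = ink[start:]
        xs = np.where(band.any(axis=0))[0]
        boxes.append((int(xs[0]), start, int(xs[-1]), len(row_has_ink) - 1))
    return boxes
```

Such boxes would then serve as pseudo-labels for pre-training the image encoder before supervised fine-tuning.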
Deep Learning for Detection and Segmentation in High-Content Microscopy Images
High-content microscopy led to many advances in biology and medicine. This fast emerging technology is transforming cell biology into a big data driven science. Computer vision methods are used to automate the analysis of microscopy image data. In recent years, deep learning became popular and had major success in computer vision. Most of the available methods are developed to process natural images. Compared to natural images, microscopy images pose domain specific challenges such as small training datasets, clustered objects, and class imbalance.
In this thesis, new deep learning methods for object detection and cell segmentation in microscopy images are introduced. For particle detection in fluorescence microscopy images, a deep learning method based on a domain-adapted Deconvolution Network is presented. In addition, a method for mitotic cell detection in heterogeneous histopathology images is proposed, which combines a deep residual network with Hough voting. The method is used for grading of whole-slide histology images of breast carcinoma. Moreover, a method for both particle detection and cell detection based on object centroids is introduced, which is trainable end-to-end. It comprises a novel Centroid Proposal Network, a layer for ensembling detection hypotheses over image scales and anchors, an anchor regularization scheme which favours prior anchors over regressed locations, and an improved algorithm for Non-Maximum Suppression. Furthermore, a novel loss function based on Normalized Mutual Information is proposed which can cope with strong class imbalance and is derived within a Bayesian framework.
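For context on the improved Non-Maximum Suppression mentioned above, the standard greedy NMS baseline can be sketched as follows; the thesis's improved variant is not reproduced here, and the IoU threshold is an illustrative default.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Standard greedy non-maximum suppression: repeatedly keep the
    highest-scoring box and discard remaining boxes whose IoU with it
    exceeds iou_thresh. boxes: (N, 4) array of (x0, y0, x1, y1)."""
    order = np.argsort(scores)[::-1]             # indices by descending score
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        # intersection of the kept box with all remaining candidates
        x0 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y0 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x1 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y1 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x1 - x0, 0, None) * np.clip(y1 - y0, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + areas - inter)
        order = order[1:][iou <= iou_thresh]     # drop heavily overlapping boxes
    return keep
```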
For cell segmentation, a deep neural network with increased receptive field to capture rich semantic information is introduced. Moreover, a deep neural network which combines both paradigms of multi-scale feature aggregation of Convolutional Neural Networks and iterative refinement of Recurrent Neural Networks is proposed. To increase the robustness of the training and improve segmentation, a novel focal loss function is presented.
In addition, a framework for black-box hyperparameter optimization for biomedical image analysis pipelines is proposed. The framework has a modular architecture that separates hyperparameter sampling and hyperparameter optimization. A visualization of the loss function based on infimum projections is suggested to obtain further insights into the optimization problem. Also, a transfer learning approach is presented, which uses only one color channel for pre-training and performs fine-tuning on more color channels. Furthermore, an approach for unsupervised domain adaptation for histopathological slides is presented.
Finally, Galaxy Image Analysis is presented, a platform for web-based microscopy image analysis. Galaxy Image Analysis workflows for cell segmentation in cell cultures, particle detection in mice brain tissue, and MALDI/H&E image registration have been developed.
The proposed methods were applied to challenging synthetic as well as real microscopy image data from various microscopy modalities. The evaluation showed that the proposed methods yield state-of-the-art or improved results. The methods were benchmarked in international image analysis challenges and used in various cooperation projects with biomedical researchers.
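The focal loss family referenced above addresses class imbalance by down-weighting well-classified examples. The thesis presents a novel variant that is not reproduced here; the sketch below is the standard binary focal loss (Lin et al., 2017) that such variants build on, with typical default parameters.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Standard binary focal loss: cross-entropy scaled by (1 - p_t)^gamma,
    so easy examples contribute little and training focuses on hard ones.
    p: predicted foreground probabilities, y: 0/1 ground-truth labels."""
    p = np.clip(p, eps, 1 - eps)
    pt = np.where(y == 1, p, 1 - p)              # probability of the true class
    w = np.where(y == 1, alpha, 1 - alpha)       # class-balancing weight
    return float(np.mean(-w * (1 - pt) ** gamma * np.log(pt)))
```

With gamma = 0 this reduces to alpha-weighted cross-entropy; larger gamma suppresses the loss of already-confident predictions more aggressively.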
Malicious JavaScript Detection using Statistical Language Model
The Internet is immensely important in our day-to-day life, but at the same time it has become a medium for infecting computers, attacking users, and distributing malicious code. As JavaScript is the principal language of client-side programming, it is frequently used in conducting such attacks. Various approaches have been proposed to address JavaScript security issues; some advanced ones combine machine learning with de-obfuscation and emulation, and many incorporate both static and dynamic analysis. Our solution is based entirely on static analysis, which avoids unnecessary runtime overhead.
The central objective of this project is to integrate the work done by Eunjin (EJ) Jung et al. on Towards A Robust Detection of Malicious JavaScript (TARDIS) into the web browser via a Firefox add-on and to demonstrate the usability of our add-on in defending against such attacks. TARDIS uses statistical language modeling for automatic feature extraction and combines it with structural features from an abstract syntax tree [1]. We have developed a Firefox add-on that is capable of extracting JavaScript code from the page visited and classifying it as either malicious or benign. We leverage the benefit of using a pre-compiled training model in JavaScript Object Notation (JSON). JSON is lightweight and does not consume much memory on a user's machine. Moreover, it stores the data as key-value pairs and easily maps to the data structures used in modern programming languages. The principal advantage of using a pre-compiled training model is better performance. Our model achieves 98% accuracy on our sample dataset.
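The pre-compiled JSON model described above can be sketched as a key-value weight table loaded once and applied to extracted features. The feature names, weights, and linear scoring rule below are purely illustrative assumptions, not the actual TARDIS model or its classifier.

```python
import json

# Hypothetical pre-compiled model: feature name -> weight, plus a bias.
# In practice this JSON would ship with the add-on and be parsed once.
MODEL_JSON = '{"bias": -1.0, "weights": {"eval_calls": 2.5, "unescape_calls": 1.8, "ast_depth": 0.1}}'

def classify(features, model_json=MODEL_JSON, threshold=0.0):
    """Score a script's extracted feature counts against the JSON model
    and label it. Unknown features simply contribute zero weight."""
    model = json.loads(model_json)
    score = model["bias"] + sum(
        model["weights"].get(name, 0.0) * value
        for name, value in features.items()
    )
    return "malicious" if score > threshold else "benign"
```

The key-value layout is what makes the JSON map directly onto a dictionary lookup at classification time, which is where the performance benefit of pre-compilation comes from.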
Benchmarking Self-Supervised Contrastive Learning Methods for Image-based Plant Phenotyping
Image-based plant phenotyping enables the high-throughput measurement of the physical characteristics of plants by combining
one or more imaging technologies with image analysis tools. Over the past decade, deep learning has been widely successful
for image-based tasks like image classification, object detection, image segmentation and object counting. While deep
learning has been applied to image-based plant phenotyping tasks like plant species classification, plant disease detection,
and leaf counting, its application has been limited. Part of the reason for this is that deep learning models tend to rely on large annotated datasets for training, and generating such datasets can be expensive and time-consuming.
Motivated by the need to leverage unlabelled data, a lot of research effort has recently been directed towards the area of self-supervised learning (SSL). The common theme among various SSL methods is that they derive the supervisory signal
from the data itself, usually by distorting the input in some way and learning features that are invariant to the distortions. Despite the surge of research in this area, there has been a paucity of research applying self-supervised
learning on image-based plant phenotyping tasks, particularly detection and counting tasks. We address this gap by benchmarking two self-supervised learning methods -- MoCo v2 and DenseCL -- on four image-based plant phenotyping tasks
(the downstream tasks): wheat head detection, plant instance detection, wheat spikelet counting and leaf counting. We study the effects of the domain of the pre-training dataset on the transfer performance using four large-scale datasets: ImageNet
(general purpose concepts), iNaturalist 2021 (natural world images), iNaturalist 2021 Plants (plant images) and the TerraByte Field Crop dataset (crop images). To understand the differences between the internal representations of the neural networks trained with the different methods, we applied a representation similarity analysis technique known as orthogonal Procrustes distance. Our results show that (1) fine-tuning a model that is pre-trained with an SSL method typically outperforms training
from scratch for a downstream task, (2) supervised pre-training outperforms DenseCL and MoCo v2 for all the
downstream tasks, except for the leaf counting task where DenseCL excels, (3) there is not much difference, both in downstream performance and internal representations, between MoCo v2 and DenseCL pre-trained models, (4) pre-training with the iNaturalist 2021 Plants dataset leads to the best downstream performance more often than other datasets, and
(5) models pre-trained in a supervised manner learn more dissimilar features towards the last layers compared to models
pre-trained with MoCo v2 or DenseCL. We hope that this benchmark study will inspire further work towards the development of better self-supervised representation learning methods for image-based plant phenotyping tasks.
An Analysis of Scale Invariance in Object Detection - SNIP
An analysis of different techniques for recognizing and detecting objects
under extreme scale variation is presented. Scale specific and scale invariant
design of detectors are compared by training them with different configurations
of input data. By evaluating the performance of different network architectures
for classifying small objects on ImageNet, we show that CNNs are not robust to
changes in scale. Based on this analysis, we propose to train and test
detectors on the same scales of an image-pyramid. Since small and large objects
are difficult to recognize at smaller and larger scales respectively, we
present a novel training scheme called Scale Normalization for Image Pyramids
(SNIP) which selectively back-propagates the gradients of object instances of
different sizes as a function of the image scale. On the COCO dataset, our
single model performance is 45.7% and an ensemble of 3 networks obtains an mAP
of 48.3%. We use off-the-shelf ImageNet-1000 pre-trained models and only train
with bounding box supervision. Our submission won the Best Student Entry in the
COCO 2017 challenge. Code will be made available at
\url{http://bit.ly/2yXVg4c}.
Comment: CVPR 2018, camera ready version
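The core SNIP mechanism described above, selecting which object instances contribute gradients at each image-pyramid scale, can be sketched as a simple validity filter. The scale and area-range values below are made-up illustrations, not the paper's actual settings.

```python
# Illustrative sketch of scale-normalized training: at a given pyramid
# scale, only instances whose scaled box area falls inside a valid range
# back-propagate gradients; all others are ignored at that scale.
def valid_instances(boxes, scale, area_range):
    """boxes: list of (x0, y0, x1, y1) in original-image coordinates.
    Returns one bool per box: does it train at this pyramid scale?"""
    lo, hi = area_range
    keep = []
    for (x0, y0, x1, y1) in boxes:
        area = ((x1 - x0) * scale) * ((y1 - y0) * scale)  # area after resizing
        keep.append(lo <= area <= hi)
    return keep
```

Across the pyramid, every instance falls into the valid range of some scale, so each object is learned at a resolution close to the pre-training distribution.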