
    SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation

    Document layout analysis is a well-known problem in the document research community and has been explored extensively, yielding a multitude of solutions ranging from text mining and recognition to graph-based representation, visual feature extraction, etc. However, most existing works have ignored a crucial fact: the scarcity of labeled data. With growing internet connectivity in daily life, an enormous number of documents have become available in the public domain, making data annotation a tedious task. We address this challenge using self-supervision and, unlike the few existing self-supervised document segmentation approaches that use text mining and textual labels, we use a purely vision-based approach in pre-training, without any ground-truth label or its derivative. Instead, we generate pseudo-layouts from the document images to pre-train an image encoder to learn document object representation and localization in a self-supervised framework before fine-tuning it with an object detection model. We show that our pipeline sets a new benchmark in this context and performs on par with, if not outperforms, the existing methods and the supervised counterparts. The code is made publicly available at https://github.com/MaitySubhajit/SelfDocSeg. Comment: Accepted at the 17th International Conference on Document Analysis and Recognition (ICDAR 2023).
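
    As a rough illustration of the vision-only pseudo-layout idea, the sketch below derives pseudo bounding boxes from a raw document image by binarizing it and dilating the ink into region-level blobs; the function name, parameters, and thresholds are illustrative assumptions, not the paper's actual procedure.

        import cv2
        import numpy as np

        def pseudo_layout(image_path, kernel=(15, 15), min_area=500):
            # Binarize with Otsu so ink becomes foreground, then dilate to
            # merge characters and lines into region-level blobs.
            gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
            _, binary = cv2.threshold(gray, 0, 255,
                                      cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
            blobs = cv2.dilate(binary, np.ones(kernel, np.uint8), iterations=2)
            # Connected regions become pseudo ground-truth boxes (x, y, w, h),
            # usable as self-supervised targets for an image encoder.
            contours, _ = cv2.findContours(blobs, cv2.RETR_EXTERNAL,
                                           cv2.CHAIN_APPROX_SIMPLE)
            boxes = [cv2.boundingRect(c) for c in contours]
            return [b for b in boxes if b[2] * b[3] >= min_area]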

    Deep Learning for Detection and Segmentation in High-Content Microscopy Images

    High-content microscopy has led to many advances in biology and medicine. This fast-emerging technology is transforming cell biology into a big-data-driven science. Computer vision methods are used to automate the analysis of microscopy image data. In recent years, deep learning has become popular and achieved major success in computer vision. Most of the available methods are developed to process natural images. Compared to natural images, microscopy images pose domain-specific challenges such as small training datasets, clustered objects, and class imbalance. In this thesis, new deep learning methods for object detection and cell segmentation in microscopy images are introduced.

    For particle detection in fluorescence microscopy images, a deep learning method based on a domain-adapted Deconvolution Network is presented. In addition, a method for mitotic cell detection in heterogeneous histopathology images is proposed, which combines a deep residual network with Hough voting; the method is used for grading whole-slide histology images of breast carcinoma. Moreover, a method for both particle detection and cell detection based on object centroids is introduced, which is trainable end-to-end. It comprises a novel Centroid Proposal Network, a layer for ensembling detection hypotheses over image scales and anchors, an anchor regularization scheme which favours prior anchors over regressed locations, and an improved algorithm for Non-Maximum Suppression. Furthermore, a novel loss function based on Normalized Mutual Information is proposed, which can cope with strong class imbalance and is derived within a Bayesian framework.

    For cell segmentation, a deep neural network with an increased receptive field to capture rich semantic information is introduced. Moreover, a deep neural network is proposed which combines both paradigms: multi-scale feature aggregation of Convolutional Neural Networks and iterative refinement of Recurrent Neural Networks. To increase the robustness of training and improve segmentation, a novel focal loss function is presented.

    In addition, a framework for black-box hyperparameter optimization of biomedical image analysis pipelines is proposed. The framework has a modular architecture that separates hyperparameter sampling from hyperparameter optimization. A visualization of the loss function based on infimum projections is suggested to obtain further insight into the optimization problem. Also, a transfer learning approach is presented which uses only one color channel for pre-training and performs fine-tuning on more color channels. Furthermore, an approach for unsupervised domain adaptation of histopathological slides is presented.

    Finally, Galaxy Image Analysis, a platform for web-based microscopy image analysis, is presented. Galaxy Image Analysis workflows have been developed for cell segmentation in cell cultures, particle detection in mouse brain tissue, and MALDI/H&E image registration. The proposed methods were applied to challenging synthetic as well as real microscopy image data from various microscopy modalities, and yield state-of-the-art or improved results. The methods were benchmarked in international image analysis challenges and used in various cooperation projects with biomedical researchers.
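
    The thesis's focal loss variant is not reproduced here, but as a reference point the sketch below implements the standard binary focal loss of Lin et al. (2017) in PyTorch, which down-weights easy examples to cope with class imbalance; the default alpha and gamma are commonly used values, assumed rather than taken from the thesis.

        import torch
        import torch.nn.functional as F

        def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
            # Standard binary focal loss: scale the cross-entropy of each
            # example by (1 - p_t)^gamma so well-classified (easy) examples
            # contribute little, countering strong class imbalance.
            ce = F.binary_cross_entropy_with_logits(logits, targets,
                                                    reduction="none")
            p = torch.sigmoid(logits)
            p_t = p * targets + (1 - p) * (1 - targets)
            alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
            return (alpha_t * (1 - p_t) ** gamma * ce).mean()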

    Malicious JavaScript Detection using Statistical Language Model

    The Internet has immense importance in our day-to-day lives, but at the same time it has become a medium for infecting computers, attacking users, and distributing malicious code. As JavaScript is the principal language of client-side programming, it is frequently used in conducting such attacks. Various approaches have been proposed to address JavaScript security issues. Some advanced approaches utilize machine learning technology in combination with de-obfuscation and emulation, and many analysis methods incorporate both static analysis and dynamic analysis. Our solution is based entirely on static analysis, which avoids unnecessary runtime overhead. The central objective of this project is to integrate the work done by Eunjin (EJ) Jung et al. on Towards A Robust Detection of Malicious JavaScript (TARDIS) into the web browser via a Firefox add-on and to demonstrate the usability of our add-on in defending against such attacks. TARDIS uses statistical language modeling for automatic feature extraction and combines it with structural features from an abstract syntax tree [1]. We have developed a Firefox add-on that extracts JavaScript code from the visited page and classifies it as either malicious or benign. We leverage the benefit of using a pre-compiled training model in JavaScript Object Notation (JSON). JSON is lightweight and does not consume much memory on a user's machine. Moreover, it stores data as key-value pairs and maps easily to the data structures used in modern programming languages. The principal advantage of using a pre-compiled training model is better performance. Our model achieves 98% accuracy on our sample dataset.
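
    The sketch below (in Python rather than the add-on's JavaScript, for brevity) shows one way a pre-compiled JSON model of n-gram log-probabilities could be loaded and used to score a token stream; the JSON schema, field names, and decision rule are hypothetical, and TARDIS's AST-based structural features are omitted.

        import json

        def load_model(path):
            # Hypothetical schema:
            # {"logprob": {"var x =": -4.2, ...},
            #  "unseen": -12.0, "threshold": -7.5}
            with open(path) as f:
                return json.load(f)

        def classify(tokens, model, n=3):
            # Score the script by its mean n-gram log-probability and flag
            # it when the score crosses the model's decision threshold.
            grams = [" ".join(tokens[i:i + n])
                     for i in range(len(tokens) - n + 1)]
            if not grams:
                return "benign"
            lp = model["logprob"]
            avg = sum(lp.get(g, model["unseen"]) for g in grams) / len(grams)
            return "malicious" if avg < model["threshold"] else "benign"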

    Benchmarking Self-Supervised Contrastive Learning Methods for Image-based Plant Phenotyping

    Image-based plant phenotyping enables high-throughput measurement of the physical characteristics of plants by combining one or more imaging technologies with image analysis tools. Over the past decade, deep learning has been widely successful on image-based tasks like image classification, object detection, image segmentation, and object counting. While deep learning has been applied to image-based plant phenotyping tasks like plant species classification, plant disease detection, and leaf counting, its application has been limited. Part of the reason for this is that deep learning models tend to rely on large annotated datasets for training, and generating such datasets can be expensive and time-consuming. Motivated by the need to leverage unlabelled data, much research effort has recently been directed toward self-supervised learning (SSL). The common theme among various SSL methods is that they derive the supervisory signal from the data itself, usually by distorting the input in some way and learning features that are invariant to the distortions. Despite the surge of research in this area, little work has applied self-supervised learning to image-based plant phenotyping tasks, particularly detection and counting tasks. We address this gap by benchmarking two self-supervised learning methods -- MoCo v2 and DenseCL -- on four image-based plant phenotyping tasks (the downstream tasks): wheat head detection, plant instance detection, wheat spikelet counting, and leaf counting. We study the effect of the pre-training dataset's domain on transfer performance using four large-scale datasets: ImageNet (general-purpose concepts), iNaturalist 2021 (natural-world images), iNaturalist 2021 Plants (plant images), and the TerraByte Field Crop dataset (crop images). To understand the differences between the internal representations of the neural networks trained with the different methods, we apply a representation similarity analysis technique known as the orthogonal Procrustes distance, illustrated in the sketch below. Our results show that (1) fine-tuning a model pre-trained with an SSL method typically outperforms training from scratch on a downstream task; (2) supervised pre-training outperforms DenseCL and MoCo v2 on all downstream tasks except leaf counting, where DenseCL excels; (3) there is little difference, both in downstream performance and in internal representations, between MoCo v2 and DenseCL pre-trained models; (4) pre-training on the iNaturalist 2021 Plants dataset leads to the best downstream performance more often than the other datasets; and (5) models pre-trained in a supervised manner learn more dissimilar features toward the last layers than models pre-trained with MoCo v2 or DenseCL. We hope that this benchmark study will inspire further work toward the development of better self-supervised representation learning methods for image-based plant phenotyping tasks.
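
    For readers unfamiliar with the similarity measure, the sketch below computes a common formulation of the orthogonal Procrustes distance between two activation matrices: after normalization, the minimal Frobenius distance over all orthogonal alignments reduces to the nuclear norm of the cross-product. The preprocessing here is an assumption; the paper's exact setup may differ.

        import numpy as np

        def procrustes_distance(X, Y):
            # X, Y: (n_examples, n_features) activation matrices.
            # Normalize so the distance is invariant to overall scale.
            X = X / np.linalg.norm(X)
            Y = Y / np.linalg.norm(Y)
            # min over orthogonal R of ||X - Y R||_F^2
            #   = ||X||^2 + ||Y||^2 - 2 * nuclear_norm(Y^T X)
            s = np.linalg.svd(Y.T @ X, compute_uv=False)
            return float(np.sqrt(max(0.0, 2.0 - 2.0 * s.sum())))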

    An Analysis of Scale Invariance in Object Detection - SNIP

    An analysis of different techniques for recognizing and detecting objects under extreme scale variation is presented. Scale-specific and scale-invariant detector designs are compared by training them with different configurations of input data. By evaluating the performance of different network architectures for classifying small objects on ImageNet, we show that CNNs are not robust to changes in scale. Based on this analysis, we propose to train and test detectors on the same scales of an image pyramid. Since small and large objects are difficult to recognize at smaller and larger scales, respectively, we present a novel training scheme called Scale Normalization for Image Pyramids (SNIP) which selectively back-propagates the gradients of object instances of different sizes as a function of the image scale. On the COCO dataset, our single-model performance is 45.7% mAP, and an ensemble of 3 networks obtains an mAP of 48.3%. We use off-the-shelf ImageNet-1000 pre-trained models and train only with bounding box supervision. Our submission won the Best Student Entry in the COCO 2017 challenge. Code will be made available at http://bit.ly/2yXVg4c. Comment: CVPR 2018, camera-ready version.
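
    A minimal sketch of SNIP's selection rule: at each image-pyramid scale, only ground-truth instances whose rescaled area falls inside a valid range contribute gradients, and the rest are ignored. The area thresholds below are illustrative placeholders, not the paper's exact per-scale values.

        import numpy as np

        def snip_valid_mask(boxes, scale, area_lo=32 ** 2, area_hi=160 ** 2):
            # boxes: (N, 4) ground-truth boxes as (x1, y1, x2, y2) in
            # original-image coordinates; scale: pyramid resize factor.
            scaled = boxes * scale
            areas = (scaled[:, 2] - scaled[:, 0]) * (scaled[:, 3] - scaled[:, 1])
            # Back-propagate only instances within the valid size range at
            # this scale; others get no gradient at this pyramid level.
            return (areas >= area_lo) & (areas <= area_hi)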