DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
In this work we address the task of semantic image segmentation with Deep
Learning and make three main contributions that are experimentally shown to
have substantial practical merit. First, we highlight convolution with
upsampled filters, or 'atrous convolution', as a powerful tool in dense
prediction tasks. Atrous convolution allows us to explicitly control the
resolution at which feature responses are computed within Deep Convolutional
Neural Networks. It also allows us to effectively enlarge the field of view of
filters to incorporate larger context without increasing the number of
parameters or the amount of computation. Second, we propose atrous spatial
pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP
probes an incoming convolutional feature layer with filters at multiple
sampling rates and effective fields-of-view, thus capturing objects as well as
image context at multiple scales. Third, we improve the localization of object
boundaries by combining methods from DCNNs and probabilistic graphical models.
The commonly deployed combination of max-pooling and downsampling in DCNNs
achieves invariance but has a toll on localization accuracy. We overcome this
by combining the responses at the final DCNN layer with a fully connected
Conditional Random Field (CRF), which is shown both qualitatively and
quantitatively to improve localization performance. Our proposed "DeepLab"
system sets the new state of the art on the PASCAL VOC-2012 semantic image
segmentation task, reaching 79.7% mIOU on the test set, and advances the
results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and
Cityscapes. All of our code is made publicly available online.
Comment: Accepted by TPAM
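The two ideas in the abstract above, atrous convolution and ASPP, can be illustrated in one dimension. What follows is a minimal NumPy sketch, not the authors' implementation: the function names (`effective_kernel_size`, `atrous_conv1d`, `aspp_1d`) are hypothetical, the padding scheme is an assumption, and fusing the ASPP branches by summation is an assumption made for illustration. It shows that a filter with k taps and sampling rate r spans k + (k - 1)(r - 1) input samples while still using only k weights.

```python
import numpy as np


def effective_kernel_size(k: int, rate: int) -> int:
    """Input span covered by a k-tap filter with (rate - 1) holes between taps."""
    return k + (k - 1) * (rate - 1)


def atrous_conv1d(x: np.ndarray, w: np.ndarray, rate: int = 1) -> np.ndarray:
    """Same-size 1-D atrous (dilated) correlation with zero padding.

    Each output sample combines len(w) inputs spaced `rate` apart, so the
    number of weights and multiply-adds per output does not depend on `rate`.
    """
    k = len(w)
    span = effective_kernel_size(k, rate)
    pad_left = (span - 1) // 2
    pad_right = span - 1 - pad_left
    xp = np.pad(np.asarray(x, dtype=float), (pad_left, pad_right))
    out = np.zeros(len(x), dtype=float)
    for i in range(len(x)):
        # Tap j reads the padded input at offset j * rate from position i.
        out[i] = sum(w[j] * xp[i + j * rate] for j in range(k))
    return out


def aspp_1d(x: np.ndarray, filters: dict) -> np.ndarray:
    """Toy ASPP: probe the same input at several rates and sum the branches.

    `filters` maps a sampling rate to the weights of that branch
    (hypothetical structure; sum fusion is an assumption of this sketch).
    """
    return sum(atrous_conv1d(x, np.asarray(w, dtype=float), rate)
               for rate, w in filters.items())


if __name__ == "__main__":
    impulse = np.zeros(11)
    impulse[5] = 1.0
    w = np.array([1.0, 2.0, 3.0])
    # Same three weights, wider field of view as the rate grows.
    for rate in (1, 2, 4):
        print(rate, effective_kernel_size(len(w), rate), atrous_conv1d(impulse, w, rate))
    print(aspp_1d(impulse, {1: [1, 1, 1], 2: [1, 1, 1]}))
```

With rate 1 the function reduces to an ordinary same-size correlation; larger rates spread the same three taps further apart, which is the sense in which the field of view grows without extra parameters.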
LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching
Obtaining large pre-trained models that can be fine-tuned to new tasks with
limited annotated samples has remained an open challenge for medical imaging
data. While deep networks pre-trained on ImageNet and vision-language
foundation models trained on web-scale data are prevailing approaches, their
effectiveness on medical tasks is limited due to the significant domain shift
between natural and medical images. To bridge this gap, we introduce LVM-Med,
the first family of deep networks trained on large-scale medical datasets. We
have collected approximately 1.3 million medical images from 55 publicly
available datasets, covering a large number of organs and modalities such as
CT, MRI, X-ray, and Ultrasound. We benchmark several state-of-the-art
self-supervised algorithms on this dataset and propose a novel self-supervised
contrastive learning algorithm using a graph-matching formulation. The proposed
approach makes three contributions: (i) it integrates prior pair-wise image
similarity metrics based on local and global information; (ii) it captures the
structural constraints of feature embeddings through a loss function
constructed via a combinatorial graph-matching objective; and (iii) it can be
trained efficiently end-to-end using modern gradient-estimation techniques for
black-box solvers. We thoroughly evaluate the proposed LVM-Med on 15 downstream
medical tasks ranging from segmentation and classification to object detection,
in both in-distribution and out-of-distribution settings. LVM-Med empirically
outperforms a number of state-of-the-art supervised, self-supervised, and
foundation models. For challenging tasks such as Brain Tumor Classification or
Diabetic Retinopathy Grading, LVM-Med improves on previous vision-language models
trained on 1 billion masks by 6-7% while using only a ResNet-50.
Symmetry Shape Prior for Object Segmentation
Symmetry is a useful segmentation cue. We develop an algorithm for segmenting a single symmetric object from the background. Our algorithm is formulated in a principled global optimization framework, so we can incorporate all the useful segmentation cues in the global energy function, in addition to the symmetry shape prior. We use the standard cues of regular boundary and coherent object (background) appearance. Our algorithm consists of two stages. The first stage, based on seam carving, detects a set of symmetry axis candidates. Symmetry axes are detected by first finding image “seams” that are aligned with intensity gradients and then matching them based on pairwise symmetry. The second stage formulates symmetric object segmentation in a discrete optimization framework. We choose the longest symmetry axis as the object axis. Object symmetry is encouraged through long-range pairwise terms. These pairwise terms are submodular, so optimization with a graph cut is applicable. We demonstrate the effectiveness of the symmetry cue on a new symmetric object dataset.
On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator
Deployed image classification pipelines are typically dependent on the images captured in real-world environments. This means that images might be affected by different sources of perturbation (e.g. sensor noise in low-light environments). The main challenge arises from the fact that image quality directly impacts the reliability and consistency of classification tasks. This challenge has, hence, attracted wide interest within the computer vision community. We propose a transformation step that attempts to enhance the generalization ability of CNN models in the presence of unseen noise in the test set. Concretely, the delineation maps of given images are determined using the CORF push-pull inhibition operator. Such an operation transforms an input image into a space that is more robust to noise before being processed by a CNN. We evaluated our approach on the Fashion MNIST data set with an AlexNet model. On noise-free images, the proposed CORF-augmented pipeline achieved results comparable to those of a conventional AlexNet classification model without CORF delineation maps, but it consistently achieved significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise.
Implementation and optimization of algorithms for the analysis of Biomedical Big Data (Implementazione ed ottimizzazione di algoritmi per l'analisi di Biomedical Big Data)
Big Data Analytics poses many challenges to the research community, which has to handle several computational problems related to the vast amount of data.
Interest in Biomedical data is increasing, with the aim of achieving so-called personalized medicine, where therapy plans are designed around the specific genotype and phenotype of an individual patient; algorithm optimization plays a key role in this purpose.
In this work we discuss several topics related to Biomedical Big Data Analytics, with special attention to the numerical issues and algorithmic solutions related to them.
We introduce a novel feature selection algorithm tailored to omics datasets, proving its efficiency on synthetic and real high-throughput genomic datasets.
We tested our algorithm against other state-of-the-art methods, obtaining better or comparable results.
We also implemented and optimized different types of deep learning models, testing their efficiency on biomedical image processing tasks.
Three novel frameworks for developing deep learning neural network models are discussed and used to describe the numerical improvements proposed on various topics.
In the first implementation we optimize two Super Resolution models, showing their results on NMR images and proving their efficiency in generalization tasks without retraining.
The second optimization involves a state-of-the-art Object Detection neural network architecture, obtaining a significant speedup in computational performance.
In the third application we discuss the femur head segmentation problem on CT images using deep learning algorithms.
The last section of this work involves the implementation of a novel biomedical database obtained by the harmonization of multiple data sources, that provides network-like relationships between biomedical entities.
Data related to diseases and other related biological entities were mined using web-scraping methods, and a novel natural language processing pipeline was designed to maximize the overlap between the different data sources involved in this project.
Medical image segmentation and analysis using statistical shape modelling and inter-landmark relationships
The study of anatomical morphology is of great importance to medical imaging, with applications varying from clinical diagnosis to computer-aided surgery. To this end, automated tools are required for accurate extraction of the anatomical boundaries from the image data and detailed interpretation of morphological information. This thesis introduces a novel approach to shape-based analysis of medical images based on Inter-Landmark Descriptors (ILDs). Unlike point coordinates that describe absolute position, these shape variables represent the relative configuration of landmarks in the shape. The proposed work is motivated by the inherent difficulties of methods based on landmark coordinates in challenging applications. Through explicit invariance to pose parameters and decomposition of the global shape constraints, this work permits anatomical shape analysis that is resistant to image inhomogeneities and geometrical inconsistencies. Several algorithms are presented to tackle specific image segmentation and analysis problems, including automatic initialisation, optimal feature point search, outlier handling and dynamic abnormality localisation. Detailed validation results are provided based on various cardiovascular magnetic resonance datasets, showing increased robustness and accuracy.
Deep Learning Approach for Chemistry and Processing History Prediction from Materials Microstructure
Finding the chemical composition and processing history from a microstructure morphology for heterogeneous materials is desired in many applications. While simulation methods based on physical concepts, such as the phase-field method, can predict the spatio-temporal evolution of the materials’ microstructure, they are not efficient techniques for predicting processing and chemistry if a specific morphology is desired. In this study, we propose a framework based on a deep learning approach that enables us to predict the chemistry and processing history just by reading the morphological distribution of one element. As a case study, we used a dataset from spinodal decomposition simulations of an Fe–Cr–Co alloy created by the phase-field method. The mixed dataset, which includes both images, i.e., the morphology of Fe distribution, and continuous data, i.e., the Fe minimum and maximum concentration in the microstructures, is used as input data, and the spinodal temperature and initial chemical composition are utilized as the output data to train the proposed deep neural network. The proposed convolutional layers were compared with pretrained EfficientNet convolutional layers as transfer learning in microstructure feature extraction. The results show that the trained shallow network is effective for chemistry prediction. However, accurate prediction of processing temperature requires more complex feature extraction from the morphology of the microstructure. We benchmarked the model's predictive accuracy for real alloy systems with an Fe–Cr–Co transmission electron microscopy micrograph. The predicted chemistry and heat treatment temperature were in good agreement with the ground truth.
Texture and Colour in Image Analysis
Research in colour and texture has experienced major changes in the last few years. This book presents some recent advances in the field, specifically in the theory and applications of colour texture analysis. This volume also features benchmarks, comparative evaluations and reviews