844 research outputs found
An Analysis of Scale Invariance in Object Detection - SNIP
An analysis of different techniques for recognizing and detecting objects
under extreme scale variation is presented. Scale-specific and scale-invariant
detector designs are compared by training them with different configurations
of input data. By evaluating the performance of different network architectures
for classifying small objects on ImageNet, we show that CNNs are not robust to
changes in scale. Based on this analysis, we propose to train and test
detectors on the same scales of an image-pyramid. Since small and large objects
are difficult to recognize at smaller and larger scales respectively, we
present a novel training scheme called Scale Normalization for Image Pyramids
(SNIP) which selectively back-propagates the gradients of object instances of
different sizes as a function of the image scale. On the COCO dataset, our
single model performance is 45.7% and an ensemble of 3 networks obtains an mAP
of 48.3%. We use off-the-shelf ImageNet-1000 pre-trained models and only train
with bounding box supervision. Our submission won the Best Student Entry in the
COCO 2017 challenge. Code will be made available at
\url{http://bit.ly/2yXVg4c}.
Comment: CVPR 2018, camera-ready version
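The selective back-propagation idea in SNIP can be sketched in a few lines. The size values and valid range below are hypothetical placeholders, not the thresholds used in the paper, and the functions are an illustrative simplification rather than the authors' implementation:

```python
def snip_valid_mask(instance_sizes, valid_range):
    """Mark instances whose size (e.g. sqrt of box area after resizing
    to this pyramid scale) falls inside the valid range; only these
    contribute gradients at this image scale."""
    lo, hi = valid_range
    return [lo <= s <= hi for s in instance_sizes]

def snip_scale_loss(per_instance_losses, instance_sizes, valid_range):
    """Average the loss over valid instances only, so out-of-range
    instances are effectively excluded from back-propagation."""
    mask = snip_valid_mask(instance_sizes, valid_range)
    kept = [loss for loss, ok in zip(per_instance_losses, mask) if ok]
    return sum(kept) / len(kept) if kept else 0.0
```

Each pyramid scale would use its own valid range, so every object is trained at the resolution where its size is closest to the pre-training regime.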
Assessing knee OA severity with CNN attention-based end-to-end architectures
This work proposes a novel end-to-end convolutional neural network (CNN) architecture to automatically quantify the severity of knee osteoarthritis (OA) from X-ray images, incorporating trainable attention modules that act as unsupervised fine-grained detectors of the region of interest (ROI). The proposed attention modules can be applied at different levels and scales across any CNN pipeline, helping the network learn relevant attention patterns over the most informative parts of the image at different resolutions. We test the proposed attention mechanism on existing state-of-the-art CNN architectures as our base models, achieving promising results on the benchmark knee OA datasets from the Osteoarthritis Initiative (OAI) and the Multicenter Osteoarthritis Study (MOST).
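As a rough, framework-free illustration of what such an attention module does, the sketch below softmax-normalizes per-position attention scores and pools feature vectors with them; the function name, shapes, and flat spatial layout are hypothetical simplifications, not the paper's API:

```python
import math

def attention_pool(features, scores):
    """Softmax the per-position attention scores, then return the
    attention-weighted average of the feature vectors. High-scoring
    positions (the learned ROI) dominate the pooled descriptor."""
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features))
              for d in range(dim)]
    return pooled, weights
```

In a real pipeline the scores would themselves be produced by a small trainable sub-network, so the ROI emerges without any localization labels.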
A context based deep learning approach for unbalanced medical image segmentation
Automated medical image segmentation is an important step in many medical
procedures. Recently, deep learning networks have been widely used for various
medical image segmentation tasks, with U-Net and generative adversarial nets
(GANs) being some of the commonly used ones. Foreground-background class
imbalance is a common occurrence in medical images, and U-Net has difficulty in
handling class imbalance because of its cross entropy (CE) objective function.
Similarly, GAN also suffers from class imbalance because the discriminator
looks at the entire image to classify it as real or fake. Since the
discriminator is essentially a deep learning classifier, it is incapable of
correctly identifying minor changes in small structures. To address these
issues, we propose a novel context based CE loss function for U-Net, and a
novel architecture Seg-GLGAN. The context based CE is a linear combination of
CE obtained over the entire image and its region of interest (ROI). In
Seg-GLGAN, we introduce a novel context discriminator to which the entire image
and its ROI are fed as input, thus enforcing local context. We conduct
extensive experiments using two challenging unbalanced datasets: PROMISE12 and
ACDC. We observe that the segmentation results obtained from our methods yield
better metrics than various baseline methods.
Comment: Accepted in ISBI 202
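The context-based CE described in this abstract, a linear combination of CE over the entire image and CE over its ROI, can be sketched as follows. The binary-pixel setting and the weighting parameter `alpha` are hypothetical simplifications for illustration:

```python
import math

def cross_entropy(probs, labels):
    """Mean binary cross entropy over pixels (probs: predicted
    foreground probabilities, labels: 0/1 ground truth)."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += -math.log(p + eps) if y == 1 else -math.log(1.0 - p + eps)
    return total / len(probs)

def context_ce(probs, labels, roi_indices, alpha=0.5):
    """Linear combination of CE over the entire image and CE restricted
    to the ROI, so the minority foreground region is not swamped by the
    abundant background pixels."""
    roi_probs = [probs[i] for i in roi_indices]
    roi_labels = [labels[i] for i in roi_indices]
    return (alpha * cross_entropy(probs, labels)
            + (1.0 - alpha) * cross_entropy(roi_probs, roi_labels))
```

Because the ROI term ignores background pixels outside the region, errors on small foreground structures carry proportionally more weight than under plain image-wide CE.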
An Abstraction Model for Semantic Segmentation Algorithms
Semantic segmentation is the process of classifying each pixel in an image.
Because of its advantages, semantic segmentation is used in many tasks such as
cancer detection, robot-assisted surgery, satellite image analysis,
self-driving car control, etc. Accuracy and efficiency are the two crucial
goals in this process, and several state-of-the-art neural networks pursue
them. Each method employs different techniques to present new solutions for
increasing accuracy and efficiency and for reducing costs. The diversity of
the approaches implemented for semantic
segmentation makes it difficult for researchers to achieve a comprehensive view
of the field. To offer a comprehensive view, in this paper, an abstraction
model for the task of semantic segmentation is offered. The proposed framework
consists of four general blocks that cover the majority of methods
that have been proposed for semantic segmentation. We also compare different
approaches and consider the importance of each part in the overall performance
of a method.
Comment: 6 pages, 2 figures