DEEP FULLY RESIDUAL CONVOLUTIONAL NEURAL NETWORK FOR SEMANTIC IMAGE SEGMENTATION
Department of Computer Science and Engineering
The goal of semantic image segmentation is to partition the pixels of an image into semantically meaningful parts and to classify those parts according to a predefined label set. Although object recognition
models have recently achieved remarkable performance, even surpassing humans' ability to recognize
objects, semantic segmentation models still lag behind. One reason semantic
segmentation is a relatively hard problem is that it requires understanding the image at the pixel level while considering global
context, as opposed to object recognition. Another challenge is transferring the knowledge of an object
recognition model to the task of semantic segmentation. In this thesis, we delineate some of the
main challenges we faced in approaching semantic image segmentation with machine learning algorithms.
Our main focus was on how to use deep learning algorithms for this task, since they require the
least amount of feature engineering and have been shown to scale to large
datasets with remarkable performance. More precisely, we worked on a variation of convolutional
neural networks (CNN) suitable for the semantic segmentation task. We proposed a model called deep
fully residual convolutional network (DFRCN) to tackle this problem. Residual learning makes
training deep models feasible, which ultimately yields a rich, powerful visual representation.
Our model also benefits from skip-connections, which ease the propagation of information from the
encoder module to the decoder module. This enables our model to have fewer parameters in the
decoder module while also achieving better performance. We also evaluated an effective variant
of the proposed model on a semantic segmentation benchmark.
We first thoroughly review current high-performance models and the problems one might
face when trying to replicate them, which mainly arise from a lack of sufficient published
information. Then, we describe our own novel method, which we call the deep fully residual convolutional
network (DFRCN). We show that our method exhibits state-of-the-art performance on a challenging
benchmark for aerial image segmentation.
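The residual learning mentioned in this abstract can be illustrated with a small sketch. This is not the DFRCN architecture, whose layers are not given here. It only shows the general residual idea, y = x + F(x), on plain Python lists. The names `residual_block`, `stack`, `zero_branch`, and `half_branch` are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Branch = Callable[[Vector], Vector]


def residual_block(x: Vector, transform: Branch) -> Vector:
    """Return x + F(x): the branch F only has to model the residual."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("residual branch must preserve the input shape")
    return [xi + fi for xi, fi in zip(x, fx)]


def stack(x: Vector, transforms: Sequence[Branch]) -> Vector:
    """Apply several residual blocks in sequence."""
    for transform in transforms:
        x = residual_block(x, transform)
    return x


def zero_branch(x: Vector) -> Vector:
    """Hypothetical branch that contributes nothing."""
    return [0.0 for _ in x]


def half_branch(x: Vector) -> Vector:
    """Hypothetical branch that adds half of the input."""
    return [0.5 * v for v in x]
```

When every branch outputs zero, a stack of any depth reduces to the identity mapping, so the input passes through unchanged. This property is commonly cited as one reason residual connections make deep models easier to train.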
In-Place Activated BatchNorm for Memory-Optimized Training of DNNs
In this work we present In-Place Activated Batch Normalization (InPlace-ABN)
- a novel approach to drastically reduce the training memory footprint of
modern deep neural networks in a computationally efficient way. Our solution
substitutes the conventionally used succession of BatchNorm + Activation layers
with a single plugin layer, hence avoiding invasive framework surgery while
providing straightforward applicability for existing deep learning frameworks.
We obtain memory savings of up to 50% by dropping intermediate results and by
recovering required information during the backward pass through the inversion
of stored forward results, with only a minor increase (0.8-2%) in computation
time. Also, we demonstrate how frequently used checkpointing approaches can be
made computationally as efficient as InPlace-ABN. In our experiments on image
classification, we demonstrate on-par results on ImageNet-1k with
state-of-the-art approaches. On the memory-demanding task of semantic
segmentation, we report results for COCO-Stuff, Cityscapes and Mapillary
Vistas, obtaining new state-of-the-art results on the latter without additional
training data but in a single-scale and -model scenario. Code can be found at
https://github.com/mapillary/inplace_abn
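The idea of recovering required information "through the inversion of stored forward results" can be sketched as follows. This is a simplified illustration, not the API of the inplace_abn library. It assumes an invertible activation (a leaky ReLU with a positive slope) and a non-zero scale `gamma`, and it works on already-normalized scalar values. All function names are hypothetical.

```python
from typing import List


def leaky_relu(y: float, slope: float = 0.01) -> float:
    """Invertible activation, assuming slope > 0."""
    return y if y >= 0 else slope * y


def leaky_relu_inverse(z: float, slope: float = 0.01) -> float:
    """Undo leaky_relu; the sign of z matches the sign of the pre-activation."""
    return z if z >= 0 else z / slope


def forward(x_hat: List[float], gamma: float, beta: float,
            slope: float = 0.01) -> List[float]:
    """Scale, shift, and activate. Only this output would be kept in memory."""
    return [leaky_relu(gamma * v + beta, slope) for v in x_hat]


def recover_normalized(z: List[float], gamma: float, beta: float,
                       slope: float = 0.01) -> List[float]:
    """Recompute the normalized input from the stored output (gamma != 0)."""
    if gamma == 0:
        raise ValueError("gamma must be non-zero for the inversion to exist")
    return [(leaky_relu_inverse(v, slope) - beta) / gamma for v in z]
```

Because the normalized input can be recomputed from the output during the backward pass, it does not need to be stored separately. The recovery is exact only up to floating-point rounding.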
Deep Pyramidal Residual Networks
Deep convolutional neural networks (DCNNs) have shown remarkable performance
in image classification tasks in recent years. Generally, deep neural network
architectures are stacks consisting of a large number of convolutional layers,
and they perform downsampling along the spatial dimension via pooling to reduce
memory usage. Concurrently, the feature map dimension (i.e., the number of
channels) is sharply increased at downsampling locations, which is essential to
ensure effective performance because it increases the diversity of high-level
attributes. This also applies to residual networks and is very closely related
to their performance. In this research, instead of sharply increasing the
feature map dimension at units that perform downsampling, we gradually increase
the feature map dimension at all units to involve as many locations as
possible. This design, which is discussed in depth together with our new
insights, has proven to be an effective means of improving generalization
ability. Furthermore, we propose a novel residual unit capable of further
improving the classification accuracy with our new network architecture.
Experiments on benchmark CIFAR-10, CIFAR-100, and ImageNet datasets have shown
that our network architecture has superior generalization ability compared to
the original residual networks. Code is available at
https://github.com/jhkim89/PyramidNet
Comment: Accepted to CVPR 201
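The contrast between sharply increasing the feature map dimension at downsampling units and gradually increasing it at all units can be sketched with two channel-width schedules. The abstract does not state the exact schedule, so the linear rule below, together with the parameters `base`, `alpha`, and `num_units`, is an assumption made for illustration.

```python
from typing import List


def stepwise_widths(base: int, stages: int, units_per_stage: int) -> List[int]:
    """Conventional schedule: width doubles only when a new stage begins."""
    widths: List[int] = []
    for stage in range(stages):
        widths.extend([base * 2 ** stage] * units_per_stage)
    return widths


def gradual_widths(base: int, alpha: int, num_units: int) -> List[int]:
    """Assumed linear schedule: the width grows a little at every unit,
    adding a total of `alpha` channels across `num_units` units."""
    return [base + (alpha * k) // num_units for k in range(1, num_units + 1)]
```

With `base=16`, three stages of two units give widths 16, 16, 32, 32, 64, 64 under the stepwise schedule, while `gradual_widths(16, 48, 6)` gives 24, 32, 40, 48, 56, 64. Both end at the same final width, but the gradual schedule spreads the increase over every unit.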
Customized CNN Model for Multiple Illness Identification in Rice and Maize
Crop diseases imperil global food security and economies, demanding early detection and effective management. Convolutional Neural Networks (CNNs), particularly in rice and maize leaf disease classification, have gained traction due to their automatic feature extraction capabilities. CNN models eliminate manual feature extraction, enabling precise disease diagnosis based on learned features. Researchers have rapidly advanced these models, achieving promising results. Leaf disease characteristics like color changes, texture variations, and lesion appearance have been identified as useful for automated diagnosis using machine learning. Developing CNN models involves crucial stages: dataset preparation, architecture selection, hyperparameter tuning, and model training and evaluation. Diverse and accurately annotated datasets are pivotal, and appropriate CNN architecture selection, such as ResNet101 and XceptionNet, ensures optimal performance. These architectures' pre-training on vast image datasets enhances feature extraction. Hyperparameter tuning fine-tunes the model, and training and evaluation gauge its precision. CNN models hold potential to enhance rice and maize productivity and global food security by effectively detecting and managing diseases