Squeeze-and-Excitation SqueezeNext: An Efficient DNN for Hardware Deployment
Indiana University-Purdue University Indianapolis (IUPUI)
Convolutional neural networks (CNNs) are used in autonomous vehicles and advanced driver assistance systems (ADAS), where they have achieved great success. Before CNNs, traditional machine learning algorithms supported driver assistance systems. Architectures such as MobileNet, SqueezeNext, and SqueezeNet are currently being explored extensively; this work has improved CNN architectures and made them more suitable for implementation on real-time embedded systems.
This thesis proposes an efficient and compact CNN to improve on the performance of existing CNN architectures. The intuition behind the proposed architecture is to replace convolution layers with a more sophisticated block module and to develop a compact architecture with competitive accuracy. The thesis further explores the bottleneck module and the SqueezeNext basic block structure. The state-of-the-art SqueezeNext baseline architecture is used as a foundation to recreate and propose a high-performance SqueezeNext architecture. The proposed architecture is trained from scratch on the CIFAR-10 dataset. All training and testing results are visualized with live loss and accuracy graphs. The focus of this thesis is an adaptable and flexible model for efficient CNN performance, with a minimal tradeoff between model accuracy, size, and speed. With a model size of 0.595 MB, an accuracy of 92.60%, and a satisfactory training and validation speed of 9 seconds, this model can be deployed on a real-time autonomous system platform such as the Bluebox 2.0 by NXP.
USE-Net: Incorporating Squeeze-and-Excitation blocks into U-Net for prostate zonal segmentation of multi-institutional MRI datasets
Prostate cancer is the most common malignant tumor in men, but prostate
Magnetic Resonance Imaging (MRI) analysis remains challenging. Besides whole
prostate gland segmentation, the capability to differentiate between the blurry
boundary of the Central Gland (CG) and Peripheral Zone (PZ) can lead to
differential diagnosis, since the frequency and severity of tumors differ in these
regions. To tackle the prostate zonal segmentation task, we propose a novel
Convolutional Neural Network (CNN), called USE-Net, which incorporates
Squeeze-and-Excitation (SE) blocks into U-Net. Specifically, the SE blocks are
added after every Encoder (Enc USE-Net) or Encoder-Decoder block (Enc-Dec
USE-Net). This study evaluates the generalization ability of CNN-based
architectures on three T2-weighted MRI datasets, each one consisting of a
different number of patients and heterogeneous image characteristics, collected
by different institutions. The following mixed scheme is used for
training/testing: (i) training on either each individual dataset or multiple
prostate MRI datasets and (ii) testing on all three datasets with all possible
training/testing combinations. USE-Net is compared against three
state-of-the-art CNN-based architectures (i.e., U-Net, pix2pix, and Mixed-Scale
Dense Network), along with a semi-automatic continuous max-flow model. The
results show that training on the union of the datasets generally outperforms
training on each dataset separately, allowing for both intra-/cross-dataset
generalization. Enc USE-Net shows good overall generalization under any
training condition, while Enc-Dec USE-Net remarkably outperforms the other
methods when trained on all datasets. These findings reveal that the SE blocks'
adaptive feature recalibration provides excellent cross-dataset generalization
when testing is performed on samples of the datasets used during training.
Comment: 44 pages, 6 figures, Accepted to Neurocomputing, Co-first authors: Leonardo Rundo and Changhee Ha
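The adaptive feature recalibration performed by an SE block can be illustrated with a minimal sketch. It follows the generic Squeeze-and-Excitation recipe (global average pooling, a two-layer bottleneck, a sigmoid gate per channel) in plain Python; it is not the USE-Net implementation. The function name `se_block`, the weight matrices, and the toy feature map are hypothetical values chosen for illustration, and biases are omitted.

```python
import math

def se_block(feature_map, w1, w2):
    """Recalibrate the channels of a feature map shaped [C][H][W].

    w1 has shape [C_reduced][C] and w2 has shape [C][C_reduced].
    Biases are left out to keep the sketch short (an assumption).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels and apply a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates

# Toy input: two 2x2 channels (hypothetical values).
fmap = [
    [[1.0, 2.0], [3.0, 4.0]],  # channel 0, mean 2.5
    [[0.0, 0.0], [0.0, 0.0]],  # channel 1, mean 0.0
]
w1 = [[1.0, 1.0]]      # reduce 2 channels to 1 hidden unit
w2 = [[1.0], [-1.0]]   # expand back to 2 channels
scaled, gates = se_block(fmap, w1, w2)
print(gates)  # channel 0 receives a gate close to 1, channel 1 a gate close to 0
```

In a U-Net-style network, a block like this would sit after an encoder (or decoder) stage so that each stage's output channels are reweighted before being passed on.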
RepViT: Revisiting Mobile CNN From ViT Perspective
Recently, lightweight Vision Transformers (ViTs) have demonstrated superior
performance and lower latency compared with lightweight Convolutional Neural
Networks (CNNs) on resource-constrained mobile devices. This improvement is
usually attributed to the multi-head self-attention module, which enables the
model to learn global representations. However, the architectural disparities
between lightweight ViTs and lightweight CNNs have not been adequately
examined. In this study, we revisit the efficient design of lightweight CNNs
and emphasize their potential for mobile devices. We incrementally enhance the
mobile-friendliness of a standard lightweight CNN, specifically MobileNetV3, by
integrating the efficient architectural choices of lightweight ViTs. This ends
up with a new family of pure lightweight CNNs, namely RepViT. Extensive
experiments show that RepViT outperforms existing state-of-the-art lightweight
ViTs and exhibits favorable latency in various vision tasks. On ImageNet,
RepViT achieves over 80% top-1 accuracy with nearly 1ms latency on an iPhone
12, which is the first time for a lightweight model, to the best of our
knowledge. Our largest model, RepViT-M3, obtains 81.4% accuracy with only
1.3ms latency. The code and trained models are available at
https://github.com/jameslahm/RepViT.
Comment: 9 pages, 7 figures
Modeling Fission Gas Release at the Mesoscale using Multiscale DenseNet Regression with Attention Mechanism and Inception Blocks
Mesoscale simulations of fission gas release (FGR) in nuclear fuel provide a
powerful tool for understanding how microstructure evolution impacts FGR, but
they are computationally intensive. In this study, we present an alternate,
data-driven approach, using deep learning to predict instantaneous FGR flux
from 2D nuclear fuel microstructure images. Four convolutional neural network
(CNN) architectures with multiscale regression are trained and evaluated on
simulated FGR data generated using a hybrid phase field/cluster dynamics model.
All four networks show high predictive power, with R^2 values above 98%.
The best performing network combines a Convolutional Block Attention Module
(CBAM) and InceptionNet mechanisms to provide superior accuracy (mean absolute
percentage error of 4.4%), training stability, and robustness on very low
instantaneous FGR flux values.
Comment: Submitted to Journal of Nuclear Materials, 20 pages, 10 figures, 3 tables
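The mean absolute percentage error quoted in this abstract is a standard metric, and a short sketch shows how it is computed and why very low target values are a hard case for it. The numbers below are made-up illustrative values, not data from the paper, and the function name `mape` is a hypothetical helper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is nonzero, since each error is divided by it.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Illustrative values only: every prediction is off by 5% of its target.
moderate = mape([2.0, 4.0, 10.0], [1.9, 4.2, 10.5])
print(moderate)  # about 5.0

# The same kind of small absolute error on a very low target is a large
# relative error, which is why low flux values stress this metric.
low_value = mape([0.01], [0.02])
print(low_value)  # about 100.0
```

Because the error is normalized by the target, a model that scores well on this metric across a range that includes very small values must be accurate in relative terms, not just in absolute terms.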
SleepyWheels: An Ensemble Model for Drowsiness Detection leading to Accident Prevention
Around 40 percent of accidents related to driving on highways in India occur
due to the driver falling asleep behind the steering wheel. Several lines of
research on detecting driver drowsiness are ongoing, but they suffer from the
complexity and cost of the models. In this paper, SleepyWheels, a revolutionary
method that uses a lightweight neural network in conjunction with facial
landmark identification, is proposed to identify driver fatigue in real time.
SleepyWheels is successful in a wide range of test scenarios, including
missing facial features when the eyes or mouth are covered, drivers'
varying skin tones, camera placements, and observational angles. It works
well when emulated on real-time systems. SleepyWheels uses EfficientNetV2
and a facial landmark detector for drowsiness detection. The model
is trained on a specially created dataset on driver sleepiness and achieves
an accuracy of 97 percent. The model is lightweight, so it can be further
deployed as a mobile application for various platforms.
Comment: 20 pages
Machine Learning for Microcontroller-Class Hardware -- A Review
Advancements in machine learning have opened a new opportunity to bring
intelligence to low-end Internet-of-Things nodes such as microcontrollers.
Conventional machine learning deployment has a high memory and compute footprint,
hindering its direct use on ultra resource-constrained
microcontrollers. This paper highlights the unique requirements of enabling
onboard machine learning for microcontroller class devices. Researchers use a
specialized model development workflow for resource-limited applications to
ensure the compute and latency budget is within the device limits while still
maintaining the desired performance. We characterize a closed-loop, widely
applicable workflow of machine learning model development for microcontroller
class devices and show that several classes of applications adopt a specific
instance of it. We present both qualitative and numerical insights into
different stages of model development by showcasing several use cases. Finally,
we identify the open research challenges and unsolved questions demanding
careful consideration moving forward.
Comment: Accepted for publication at IEEE Sensors Journal