330 research outputs found
Project SEMACODE : a scale-invariant object recognition system for content-based queries in image databases
For the efficient management of large image databases, the automated characterization of images and the usage of that characterization for searching and ordering tasks are highly desirable. The purpose of the project SEMACODE is to combine the still unsolved problem of content-oriented characterization of images with scale-invariant object recognition and model-based compression methods. To achieve this goal, existing techniques as well as new concepts related to pattern matching, image encoding, and image compression are examined. The resulting methods are integrated in a common framework with the aid of a content-oriented conception. For the application, an image database at the library of the University of Frankfurt/Main (StUB; about 60000 images), the required operations are developed. The search and query interfaces are defined in close cooperation with the StUB project “Digitized Colonial Picture Library”. This report describes the fundamentals and first results of the image encoding and object recognition algorithms developed within the scope of the project.
Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
Currently, Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning (DL) is already present in many applications ranging from computer vision for medicine to autonomous driving of modern cars as well as other sectors in security, healthcare, and finance. However, to achieve impressive performance, these algorithms employ very deep networks, requiring significant computational power during both training and inference. A single inference of a DL model may require billions of multiply-and-accumulate operations, making DL extremely compute- and energy-hungry. In a scenario where several sophisticated algorithms need to be executed with limited energy and low latency, the need arises for cost-effective hardware platforms capable of energy-efficient DL execution. This paper first introduces the key properties of two brain-inspired models, the Deep Neural Network (DNN) and the Spiking Neural Network (SNN), and then analyzes techniques to produce efficient and high-performance designs. This work summarizes and compares the works for four leading platforms for executing these algorithms (CPU, GPU, FPGA, and ASIC), describing the main state-of-the-art solutions and giving particular prominence to the last two, since they offer greater design flexibility and bear the potential of high energy efficiency, especially for the inference process. In addition to hardware solutions, this paper discusses some of the important security issues that these DNN and SNN models may have during their execution, and offers a comprehensive section on benchmarking, explaining how to assess the quality of different networks and hardware systems designed for them.
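As a rough illustration of the "billions of multiply-and-accumulate operations" figure, the sketch below counts the MACs of a single dense 2D convolutional layer. The layer shape and the helper name `conv2d_macs` are hypothetical and are not taken from the survey.

```python
def conv2d_macs(out_h, out_w, out_channels, in_channels, kernel_h, kernel_w):
    """MAC count for a standard (dense, ungrouped) 2D convolution, ignoring bias.

    Each output element needs in_channels * kernel_h * kernel_w
    multiply-and-accumulate operations.
    """
    return out_h * out_w * out_channels * in_channels * kernel_h * kernel_w


# Hypothetical layer: 3x3 kernel, 64 input and 64 output channels,
# producing a 224x224 output feature map.
layer_macs = conv2d_macs(224, 224, 64, 64, 3, 3)
print(f"{layer_macs:,} MACs")  # prints "1,849,688,064 MACs"
```

Under these assumed dimensions, one layer alone already accounts for roughly 1.85 billion MACs per inference.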
Neuromorphic Visual Scene Understanding with Resonator Networks
Inferring the position of objects and their rigid transformations is still an
open problem in visual scene understanding. Here we propose a neuromorphic
solution that utilizes an efficient factorization network based on three key
concepts: (1) a computational framework based on Vector Symbolic Architectures
(VSA) with complex-valued vectors; (2) the design of Hierarchical Resonator
Networks (HRN) to deal with the non-commutative nature of translation and
rotation in visual scenes, when both are used in combination; (3) the design of
a multi-compartment spiking phasor neuron model for implementing complex-valued
vector binding on neuromorphic hardware. The VSA framework uses vector binding
operations to produce generative image models in which binding acts as the
equivariant operation for geometric transformations. A scene can therefore be
described as a sum of vector products, which in turn can be efficiently
factorized by a resonator network to infer objects and their poses. The HRN
enables the definition of a partitioned architecture in which vector binding is
equivariant for horizontal and vertical translation within one partition and
for rotation and scaling within the other partition. The spiking neuron model
allows mapping the resonator network onto efficient and low-power neuromorphic
hardware. In this work, we demonstrate our approach using synthetic scenes
composed of simple 2D shapes undergoing rigid geometric transformations and
color changes. A companion paper demonstrates this approach in real-world
application scenarios for machine vision and robotics.
Comment: 15 pages, 6 figures, minor change
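The idea of describing a scene as a sum of vector products can be sketched with complex-valued phasor vectors, where binding is element-wise multiplication and unbinding is multiplication by the complex conjugate. This is a minimal sketch under assumed names and dimensions (`DIM`, `random_phasor`, `bind`, `similarity`); it queries the scene with a known position vector and does not implement the resonator network or the spiking neuron model.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 1024  # vector dimensionality (an assumption for this sketch)


def random_phasor(dim):
    """Random complex vector whose entries all have unit magnitude."""
    return np.exp(1j * rng.uniform(-np.pi, np.pi, dim))


def bind(a, b):
    """Bind two phasor vectors by element-wise (Hadamard) multiplication."""
    return a * b


def similarity(a, b):
    """Normalized real inner product: near 1 for matching phasor vectors,
    near 0 for unrelated ones."""
    return float(np.real(np.vdot(a, b)) / len(a))


# Two hypothetical objects, each bound to its own position vector.
shape_a, shape_b = random_phasor(DIM), random_phasor(DIM)
pos_a, pos_b = random_phasor(DIM), random_phasor(DIM)

# The scene is the sum of the bound (shape, position) pairs.
scene = bind(shape_a, pos_a) + bind(shape_b, pos_b)

# Unbind with the conjugate of a known position to query what is there.
recovered = scene * np.conj(pos_a)
sim_match = similarity(recovered, shape_a)  # close to 1
sim_other = similarity(recovered, shape_b)  # close to 0
print(round(sim_match, 2), round(sim_other, 2))
```

When neither factor is known in advance, a direct query like this is not possible, which is the situation a resonator network is designed to handle.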
Exploration of Convolutional Neural Network Architectures for Large Region Map Automation
Deep learning semantic segmentation algorithms have provided improved
frameworks for the automated production of Land-Use and Land-Cover (LULC) maps,
which significantly increases the frequency of map generation as well as
consistency of production quality. In this research, a total of 28 different
model variations were examined to improve the accuracy of LULC maps. The
experiments were carried out using Landsat 5/7 or Landsat 8 satellite images
with the North American Land Change Monitoring System labels. The performance
of various CNNs and extension combinations was assessed; VGGNet with an
output stride of 4 and a modified U-Net architecture provided the best results.
Additional expanded analysis of the generated LULC maps was also provided.
Using a deep neural network, this work achieved 92.4% accuracy for 13 LULC
classes within southern Manitoba, representing a 15.8% improvement over
published results for the NALCMS. Over the large regions of interest, the
higher radiometric resolution of Landsat 8 data resulted in better overall
accuracy (88.04%) compared to Landsat 5/7 (80.66%) for 16 LULC classes. These
represent 11.44% and 4.06% increases in overall accuracy, respectively, over
previously published NALCMS results, while covering a larger land area and a
higher number of LULC classes than other published LULC map automation
methods.
Optimization and Applications of Modern Wireless Networks and Symmetry
Motivated by the future demands of wireless communications, this book focuses on channel coding, multi-access, network protocols, and related techniques for IoT/5G. Channel coding is widely used to enhance reliability and spectral efficiency. In particular, low-density parity check (LDPC) codes and polar codes are optimized for the next wireless standard. Moreover, advanced network protocols are developed to improve wireless throughput. These topics are attracting a great deal of attention in modern communications.
Spatial Pyramid Context-Aware Moving Object Detection and Tracking for Full Motion Video and Wide Aerial Motion Imagery
A robust and fast automatic moving object detection and tracking system is
essential to characterize target objects and extract spatial and temporal
information for different functionalities, including video surveillance systems,
urban traffic monitoring and navigation, and robotics. In this dissertation, I
present a collaborative Spatial Pyramid Context-aware moving object detection
and Tracking (SPCT) system. The proposed visual tracker is composed of one master
tracker that usually relies on visual object features and two auxiliary
trackers based on object temporal motion information that are called
dynamically to assist the master tracker. SPCT utilizes image spatial context at
different levels to make the video tracking system resistant to occlusion and
background noise and to improve target localization accuracy and robustness. We
chose a pre-selected set of seven-channel complementary features, including RGB
color, intensity, and a spatial pyramid of HoG, to encode object color, shape,
and spatial layout information. We exploit the integral histogram as a building block to meet the
demands of real-time performance. A novel fast algorithm is presented to
accurately evaluate spatially weighted local histograms in constant time
complexity using an extension of the integral histogram method. Different
techniques are explored to efficiently compute the integral histogram on GPU
architectures and are applied to fast spatio-temporal median computation and 3D
face reconstruction texturing. We propose a multi-component framework based on
semantic fusion of motion information with a projected building footprint map to
significantly reduce the false alarm rate in urban scenes with many tall
structures. Experiments on the extensive VOTC2016 benchmark dataset and aerial
video confirm that combining complementary tracking cues in an intelligent
fusion framework enables persistent tracking for Full Motion Video and Wide
Aerial Motion Imagery.
Comment: PhD Dissertation (162 pages)
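The constant-time region lookup that makes the integral histogram attractive for real-time tracking can be sketched as follows. This is the plain, unweighted integral histogram on the CPU, with assumed function names (`integral_histogram`, `region_histogram`); it is not the spatially weighted extension or the GPU implementation described in the dissertation.

```python
import numpy as np


def integral_histogram(image, n_bins):
    """Build an integral histogram of shape (H + 1, W + 1, n_bins).

    `image` is a 2D integer array whose values are already quantized
    to bin indices in [0, n_bins).
    """
    h, w = image.shape
    one_hot = np.zeros((h, w, n_bins), dtype=np.int64)
    rows, cols = np.indices((h, w))
    one_hot[rows, cols, image] = 1  # mark each pixel's bin
    ih = np.zeros((h + 1, w + 1, n_bins), dtype=np.int64)
    # Cumulative sums along both spatial axes, one plane per bin.
    ih[1:, 1:] = one_hot.cumsum(axis=0).cumsum(axis=1)
    return ih


def region_histogram(ih, top, left, bottom, right):
    """Histogram of image[top:bottom, left:right] from four corner lookups,
    independent of the region size."""
    return ih[bottom, right] - ih[top, right] - ih[bottom, left] + ih[top, left]


# Usage on a small random test image with 8 bins.
rng = np.random.default_rng(0)
image = rng.integers(0, 8, size=(32, 48))
ih = integral_histogram(image, 8)
hist = region_histogram(ih, 4, 10, 20, 30)
print(hist.sum())  # number of pixels in the 16x20 region
```

The precomputation costs one pass over the image per bin, after which every rectangular region query needs only four array lookups.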
DeepSeg: Deep Neural Network Framework for Automatic Brain Tumor Segmentation using Magnetic Resonance FLAIR Images
Purpose: Gliomas are the most common and aggressive type of brain tumors due
to their infiltrative nature and rapid progression. The process of
distinguishing tumor boundaries from healthy cells is still a challenging task
in clinical routine. The Fluid-Attenuated Inversion Recovery (FLAIR) MRI
modality can provide the physician with information about tumor infiltration.
Therefore, this paper proposes a new generic deep learning architecture, namely
DeepSeg, for fully automated detection and segmentation of brain lesions
using FLAIR MRI data.
Methods: The developed DeepSeg is a modular decoupling framework. It consists
of two connected core parts based on an encoding and decoding relationship. The
encoder part is a convolutional neural network (CNN) responsible for spatial
information extraction. The resulting semantic map is fed into the decoder
part to obtain the full-resolution probability map. Based on a modified U-Net
architecture, different CNN models such as Residual Neural Network (ResNet),
Dense Convolutional Network (DenseNet), and NASNet have been utilized in this
study.
Results: The proposed deep learning architectures have been successfully
tested and evaluated online on the MRI datasets of the Brain Tumor Segmentation
(BraTS 2019) challenge, including 336 cases as training data and 125 cases as
validation data. The Dice and Hausdorff distance scores of the obtained
segmentation results are about 0.81 to 0.84 and 9.8 to 19.7, respectively.
Conclusion: This study demonstrated the feasibility, and compared the
performance, of applying different deep learning models in the new DeepSeg
framework for automated brain tumor segmentation in FLAIR MR images. The
proposed DeepSeg is open-source and freely available at
https://github.com/razeineldin/DeepSeg/.
Comment: Accepted to International Journal of Computer Assisted Radiology and
Surgery
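For readers unfamiliar with the Dice score reported above, here is a minimal sketch of how it is computed for two binary masks. The function name `dice_score`, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration and are not taken from the DeepSeg code.

```python
def dice_score(pred, truth):
    """Dice coefficient 2*|A ∩ B| / (|A| + |B|) for two flat binary masks.

    `pred` and `truth` are equal-length sequences of 0/1 values.
    Returns 1.0 when both masks are empty (an assumed convention).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy example: 3 overlapping foreground pixels, 4 foreground pixels in each mask.
pred = [0, 1, 1, 1, 0, 0, 1, 0]
truth = [0, 1, 1, 0, 0, 1, 1, 0]
score = dice_score(pred, truth)
print(score)  # 0.75
```

A score of 1 means perfect overlap and 0 means no overlap, so the reported range of about 0.81 to 0.84 indicates substantial agreement with the reference segmentation.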
A novel application of Lobatto IIIA solver for numerical treatment of mixed convection nanofluidic model
The objective of the current investigation is to examine the influence of variable viscosity and a transverse magnetic field on a mixed convection fluid model over a stretching sheet based on copper and silver nanoparticles by exploiting the strength of numerical computing via the Lobatto IIIA solver. The nonlinear partial differential equations are transformed into ordinary differential equations by means of similarity transformations. A renewed finite-difference-based Lobatto IIIA method is incorporated to solve the fluidic system numerically. Vogel's model is considered to observe the influence of variable viscosity and an applied oblique magnetic field with mixed convection along with temperature-dependent viscosity. Graphical and numerical illustrations are presented to visualize the effect of various parameters of interest on velocity and temperature. The results show that the volumetric fraction of nanoparticles increases the thermal conductivity of the fluid and that the temperature rises due to blade-type copper nanoparticles. The convergence and accuracy of the solution are investigated through the residual errors at different tolerances to demonstrate the worth of the solver. The temperature of the fluid increases due to the blade-type copper nanoparticles, and the skin friction coefficient is reduced as the Grashof number increases.