330 research outputs found

    Project SEMACODE : a scale-invariant object recognition system for content-based queries in image databases

    Get PDF
    For the efficient management of large image databases, the automated characterization of images and the usage of that characterization for searching and ordering tasks is highly desirable. The purpose of the project SEMACODE is to combine the still unsolved problem of content-oriented characterization of images with scale-invariant object recognition and modelbased compression methods. To achieve this goal, existing techniques as well as new concepts related to pattern matching, image encoding, and image compression are examined. The resulting methods are integrated in a common framework with the aid of a content-oriented conception. For the application, an image database at the library of the university of Frankfurt/Main (StUB; about 60000 images), the required operations are developed. The search and query interfaces are defined in close cooperation with the StUB project “Digitized Colonial Picture Library”. This report describes the fundamentals and first results of the image encoding and object recognition algorithms developed within the scope of the project

    Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead

    Get PDF
    Currently, Machine Learning (ML) is becoming ubiquitous in everyday life. Deep Learning (DL) is already present in many applications ranging from computer vision for medicine to autonomous driving of modern cars as well as other sectors in security, healthcare, and finance. However, to achieve impressive performance, these algorithms employ very deep networks, requiring a significant computational power, both during the training and inference time. A single inference of a DL model may require billions of multiply-and-accumulated operations, making the DL extremely compute-and energy-hungry. In a scenario where several sophisticated algorithms need to be executed with limited energy and low latency, the need for cost-effective hardware platforms capable of implementing energy-efficient DL execution arises. This paper first introduces the key properties of two brain-inspired models like Deep Neural Network (DNN), and Spiking Neural Network (SNN), and then analyzes techniques to produce efficient and high-performance designs. This work summarizes and compares the works for four leading platforms for the execution of algorithms such as CPU, GPU, FPGA and ASIC describing the main solutions of the state-of-the-art, giving much prominence to the last two solutions since they offer greater design flexibility and bear the potential of high energy-efficiency, especially for the inference process. In addition to hardware solutions, this paper discusses some of the important security issues that these DNN and SNN models may have during their execution, and offers a comprehensive section on benchmarking, explaining how to assess the quality of different networks and hardware systems designed for them

    Neuromorphic Visual Scene Understanding with Resonator Networks

    Full text link
    Inferring the position of objects and their rigid transformations is still an open problem in visual scene understanding. Here we propose a neuromorphic solution that utilizes an efficient factorization network based on three key concepts: (1) a computational framework based on Vector Symbolic Architectures (VSA) with complex-valued vectors; (2) the design of Hierarchical Resonator Networks (HRN) to deal with the non-commutative nature of translation and rotation in visual scenes, when both are used in combination; (3) the design of a multi-compartment spiking phasor neuron model for implementing complex-valued vector binding on neuromorphic hardware. The VSA framework uses vector binding operations to produce generative image models in which binding acts as the equivariant operation for geometric transformations. A scene can therefore be described as a sum of vector products, which in turn can be efficiently factorized by a resonator network to infer objects and their poses. The HRN enables the definition of a partitioned architecture in which vector binding is equivariant for horizontal and vertical translation within one partition and for rotation and scaling within the other partition. The spiking neuron model allows mapping the resonator network onto efficient and low-power neuromorphic hardware. In this work, we demonstrate our approach using synthetic scenes composed of simple 2D shapes undergoing rigid geometric transformations and color changes. A companion paper demonstrates this approach in real-world application scenarios for machine vision and robotics.Comment: 15 pages, 6 figures, minor change

    Exploration of Convolutional Neural Network Architectures for Large Region Map Automation

    Full text link
    Deep learning semantic segmentation algorithms have provided improved frameworks for the automated production of Land-Use and Land-Cover (LULC) maps, which significantly increases the frequency of map generation as well as consistency of production quality. In this research, a total of 28 different model variations were examined to improve the accuracy of LULC maps. The experiments were carried out using Landsat 5/7 or Landsat 8 satellite images with the North American Land Change Monitoring System labels. The performance of various CNNs and extension combinations were assessed, where VGGNet with an output stride of 4, and modified U-Net architecture provided the best results. Additional expanded analysis of the generated LULC maps was also provided. Using a deep neural network, this work achieved 92.4% accuracy for 13 LULC classes within southern Manitoba representing a 15.8% improvement over published results for the NALCMS. Based on the large regions of interest, higher radiometric resolution of Landsat 8 data resulted in better overall accuracies (88.04%) compare to Landsat 5/7 (80.66%) for 16 LULC classes. This represents an 11.44% and 4.06% increase in overall accuracy compared to previously published NALCMS results, including larger land area and higher number of LULC classes incorporated into the models compared to other published LULC map automation methods

    Optimization and Applications of Modern Wireless Networks and Symmetry

    Get PDF
    Due to the future demands of wireless communications, this book focuses on channel coding, multi-access, network protocol, and the related techniques for IoT/5G. Channel coding is widely used to enhance reliability and spectral efficiency. In particular, low-density parity check (LDPC) codes and polar codes are optimized for next wireless standard. Moreover, advanced network protocol is developed to improve wireless throughput. This invokes a great deal of attention on modern communications

    Spatial Pyramid Context-Aware Moving Object Detection and Tracking for Full Motion Video and Wide Aerial Motion Imagery

    Get PDF
    A robust and fast automatic moving object detection and tracking system is essential to characterize target object and extract spatial and temporal information for different functionalities including video surveillance systems, urban traffic monitoring and navigation, robotic. In this dissertation, I present a collaborative Spatial Pyramid Context-aware moving object detection and Tracking system. The proposed visual tracker is composed of one master tracker that usually relies on visual object features and two auxiliary trackers based on object temporal motion information that will be called dynamically to assist master tracker. SPCT utilizes image spatial context at different level to make the video tracking system resistant to occlusion, background noise and improve target localization accuracy and robustness. We chose a pre-selected seven-channel complementary features including RGB color, intensity and spatial pyramid of HoG to encode object color, shape and spatial layout information. We exploit integral histogram as building block to meet the demands of real-time performance. A novel fast algorithm is presented to accurately evaluate spatially weighted local histograms in constant time complexity using an extension of the integral histogram method. Different techniques are explored to efficiently compute integral histogram on GPU architecture and applied for fast spatio-temporal median computations and 3D face reconstruction texturing. We proposed a multi-component framework based on semantic fusion of motion information with projected building footprint map to significantly reduce the false alarm rate in urban scenes with many tall structures. The experiments on extensive VOTC2016 benchmark dataset and aerial video confirm that combining complementary tracking cues in an intelligent fusion framework enables persistent tracking for Full Motion Video and Wide Aerial Motion Imagery.Comment: PhD Dissertation (162 pages

    DeepSeg: Deep Neural Network Framework for Automatic Brain Tumor Segmentation using Magnetic Resonance FLAIR Images

    Full text link
    Purpose: Gliomas are the most common and aggressive type of brain tumors due to their infiltrative nature and rapid progression. The process of distinguishing tumor boundaries from healthy cells is still a challenging task in the clinical routine. Fluid-Attenuated Inversion Recovery (FLAIR) MRI modality can provide the physician with information about tumor infiltration. Therefore, this paper proposes a new generic deep learning architecture; namely DeepSeg for fully automated detection and segmentation of the brain lesion using FLAIR MRI data. Methods: The developed DeepSeg is a modular decoupling framework. It consists of two connected core parts based on an encoding and decoding relationship. The encoder part is a convolutional neural network (CNN) responsible for spatial information extraction. The resulting semantic map is inserted into the decoder part to get the full resolution probability map. Based on modified U-Net architecture, different CNN models such as Residual Neural Network (ResNet), Dense Convolutional Network (DenseNet), and NASNet have been utilized in this study. Results: The proposed deep learning architectures have been successfully tested and evaluated on-line based on MRI datasets of Brain Tumor Segmentation (BraTS 2019) challenge, including s336 cases as training data and 125 cases for validation data. The dice and Hausdorff distance scores of obtained segmentation results are about 0.81 to 0.84 and 9.8 to 19.7 correspondingly. Conclusion: This study showed successful feasibility and comparative performance of applying different deep learning models in a new DeepSeg framework for automated brain tumor segmentation in FLAIR MR images. The proposed DeepSeg is open-source and freely available at https://github.com/razeineldin/DeepSeg/.Comment: Accepted to International Journal of Computer Assisted Radiology and Surger

    A novel application of Lobatto iiia solver for numerical treatment of mixed convection nanofluidic model

    Get PDF
    The objective of the current investigation is to examine the influence of variable viscosity and transverse magnetic field on mixed convection fluid model through stretching sheet based on copper and silver nanoparticles by exploiting the strength of numerical computing via Lobatto IIIA solver. The nonlinear partial differential equations are changed into ordinary differential equations by means of similarity transformations procedure. A renewed finite difference based Lobatto IIIA method is incorporated to solve the fluidic system numerically. Vogel's model is considered to observe the influence of variable viscosity and applied oblique magnetic field with mixed convection along with temperature dependent viscosity. Graphical and numerical illustrations are presented to visualize the behavior of different sundry parameters of interest on velocity and temperature. Outcomes reflect that volumetric fraction of nanoparticles causes to increase the thermal conductivity of the fluid and the temperature enhances due to blade type copper nanoparticles. The convergence analysis on the accuracy to solve the problem is investigated viably though the residual errors with different tolerances to prove the worth of the solver. The temperature of the fluid accelerates due the blade type nanoparticles of copper and skin friction coefficient is reduced due to enhancement of Grashof Number
    corecore