1,094 research outputs found

    Intelligent human action recognition using an ensemble model of evolving deep networks with swarm-based optimization.

    Automatic interpretation of human actions from realistic videos attracts increasing research attention owing to its growing demand in real-world deployments such as biometrics, intelligent robotics, and surveillance. In this research, we propose an ensemble model of evolving deep networks comprising Convolutional Neural Networks (CNNs) and bidirectional Long Short-Term Memory (BLSTM) networks for human action recognition. A swarm intelligence (SI)-based algorithm is also proposed for identifying the optimal hyper-parameters of the deep networks. The SI algorithm plays a crucial role in determining the BLSTM network and learning configurations, such as the learning and dropout rates and the number of hidden neurons, in order to establish effective deep features that accurately represent the temporal dynamics of human actions. The proposed SI algorithm incorporates hybrid crossover operators implemented by sine, cosine, and tanh functions for multiple elite offspring signal generation, as well as geometric search coefficients extracted from a three-dimensional super-ellipse surface. Moreover, it employs a versatile search process led by the yielded promising offspring solutions to overcome stagnation. Diverse CNN–BLSTM networks with distinctive hyper-parameter settings are devised. An ensemble model is subsequently constructed by aggregating a set of three optimized CNN–BLSTM networks based on their average prediction probabilities. Evaluated on several publicly available human action data sets, our evolving ensemble deep networks demonstrate statistically significant superiority over those with default settings and with optimal settings identified by other search methods. The proposed SI algorithm also shows great superiority over several other methods for solving diverse high-dimensional unimodal and multimodal optimization functions with artificial landscapes.
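    The aggregation step described in this abstract, averaging the class-probability outputs of three optimized networks, can be sketched as follows (function and variable names are hypothetical, not from the paper; the probability arrays are toy data standing in for real network outputs):

    ```python
    import numpy as np

    def ensemble_predict(prob_sets):
        """Average the class-probability outputs of several models and
        return the winning class index per sample."""
        # prob_sets: list of (n_samples, n_classes) arrays, one per model
        avg = np.mean(np.stack(prob_sets, axis=0), axis=0)
        return avg.argmax(axis=1), avg

    # Stand-ins for the outputs of three optimized CNN-BLSTM networks
    p1 = np.array([[0.6, 0.4], [0.2, 0.8]])
    p2 = np.array([[0.7, 0.3], [0.4, 0.6]])
    p3 = np.array([[0.5, 0.5], [0.3, 0.7]])
    labels, avg = ensemble_predict([p1, p2, p3])
    print(labels.tolist())  # [0, 1]
    ```

    Averaging probabilities (rather than hard votes) lets a confident model outweigh two uncertain ones, which is why probability-level fusion is a common default for small ensembles.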

    Satellite Image Compression Using ROI Based EZW Algorithm

    Fields that make extensive use of images need image compression to minimize storage requirements; the marine field, for example, relies on satellite images for communication. We therefore propose a new technique for compressing satellite images that combines Region of Interest (ROI) coding with the lossy Embedded Zero-tree Wavelet (EZW) algorithm. The performance of our method is evaluated by analyzing the PSNR values of the output images.
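    The evaluation metric named in this abstract, PSNR, can be computed directly from the mean squared error between the original and reconstructed images; a minimal sketch (names are illustrative, not from the paper):

    ```python
    import numpy as np

    def psnr(original, reconstructed, peak=255.0):
        """Peak signal-to-noise ratio in dB between two images
        in the 0..255 intensity range."""
        diff = original.astype(np.float64) - reconstructed.astype(np.float64)
        mse = np.mean(diff ** 2)
        if mse == 0:
            return float("inf")  # identical images
        return 10.0 * np.log10(peak ** 2 / mse)

    ref = np.full((4, 4), 100.0)
    test = ref + 10.0  # uniform error of 10 gray levels -> MSE = 100
    print(round(psnr(ref, test), 2))  # 10*log10(255^2/100) ≈ 28.13
    ```

    Higher PSNR indicates a reconstruction closer to the original; lossy schemes such as EZW trade PSNR for bitrate, and ROI coding spends more of that bitrate inside the region of interest.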

    Supervised Deep Learning for Content-Aware Image Retargeting with Fourier Convolutions

    Image retargeting aims to alter the size of an image while attending to its contents. One of the main obstacles to training deep learning models for image retargeting is the need for a vast labeled dataset; no such labeled datasets are available for this task. As a result, we present a new supervised approach for training deep learning models: we use the original images as ground truth and create inputs for the model by resizing and cropping the original images. A second challenge is generating different image sizes at inference time, since regular convolutional neural networks cannot generate images of a different size than the input image. To address this issue, we introduce a new method for supervised learning in which a mask is generated to indicate the desired size and location of the object; the mask and the input image are then fed to the network. Comparing existing image retargeting methods with our proposed method demonstrates the model's ability to produce high-quality retargeted images. We also compute image quality assessment scores for each output image using different techniques, illustrating the effectiveness of our approach. Comment: 18 pages, 5 figures
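    The self-supervised data construction this abstract describes, treating the original image as ground truth and deriving the input and a size/location mask from it, might look like the following simplified sketch (crop-only, no resizing; all names are hypothetical):

    ```python
    import numpy as np

    def make_training_pair(original, target_h, target_w, top, left):
        """Build an (input, mask, ground-truth) triple from one unlabeled
        image: crop a window as the degraded input and mark where the
        content should appear in the retargeted output."""
        h, w = original.shape[:2]
        crop = original[top:top + target_h, left:left + target_w]
        mask = np.zeros((h, w), dtype=np.uint8)
        mask[top:top + target_h, left:left + target_w] = 1  # desired size/location
        return crop, mask, original

    img = np.arange(64, dtype=np.uint8).reshape(8, 8)
    crop, mask, gt = make_training_pair(img, 4, 4, 2, 2)
    print(crop.shape, int(mask.sum()))  # (4, 4) 16
    ```

    The network then learns to reconstruct the ground-truth image from the cropped input plus the mask, so at inference time a differently shaped mask can request a different output geometry.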

    Deep Learning Methods for Remote Sensing

    Remote sensing is a field in which important physical characteristics of an area are extracted from emitted radiation, generally captured by satellite cameras, sensors onboard aerial vehicles, etc. The captured data help researchers develop solutions for sensing and detecting characteristics such as forest fires, flooding, changes in urban areas, crop diseases, soil moisture, etc. The recent impressive progress in artificial intelligence (AI) and deep learning has sparked innovations in technologies, algorithms, and approaches, leading to results that were until recently unachievable in multiple areas, among them remote sensing. This book consists of sixteen peer-reviewed papers covering new advances in the use of AI for remote sensing.

    TempNet -- Temporal Super Resolution of Radar Rainfall Products with Residual CNNs

    The temporal and spatial resolution of rainfall data is crucial for environmental modeling studies in which its variability in space and time is considered a primary factor. Rainfall products from different remote sensing instruments (e.g., radar, satellite) have different space-time resolutions because of the differences in their sensing capabilities and post-processing methods. In this study, we developed a deep learning approach that augments rainfall data with increased time resolutions to complement relatively lower-resolution products. We propose a neural network architecture based on Convolutional Neural Networks (CNNs) to improve the temporal resolution of radar-based rainfall products and compare the proposed model with an optical flow-based interpolation method and a CNN baseline model. The methodology presented in this study could be used for enhancing rainfall maps with better temporal resolution and for imputation of missing frames in sequences of 2D rainfall maps to support hydrological and flood forecasting studies.
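    Temporal super-resolution here means synthesizing intermediate rainfall maps between two observed radar frames. The simplest possible baseline, which the optical flow and CNN methods in the abstract aim to improve on, is a pixel-wise blend of the bracketing frames (a toy sketch, not the paper's model):

    ```python
    import numpy as np

    def interpolate_midframe(frame_a, frame_b):
        """Naive temporal upsampling: estimate the rainfall map halfway
        between two consecutive frames by pixel-wise averaging."""
        return 0.5 * (frame_a + frame_b)

    t0 = np.array([[0.0, 2.0], [4.0, 6.0]])  # rainfall at time t (mm/h)
    t1 = np.array([[2.0, 4.0], [6.0, 8.0]])  # rainfall at time t+2
    mid = interpolate_midframe(t0, t1)
    print(mid.tolist())  # [[1.0, 3.0], [5.0, 7.0]]
    ```

    Pixel-wise blending ignores storm motion, which is why flow-based warping and learned CNN interpolators produce sharper intermediate frames.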

    TempNet – Temporal Super-resolution Of Radar Rainfall Products With Residual CNNs

    The temporal and spatial resolution of rainfall data is crucial for environmental modeling studies in which its variability in space and time is considered a primary factor. Rainfall products from different remote sensing instruments (e.g., radar, satellite) have different space-time resolutions because of the differences in their sensing capabilities and post-processing methods. In this study, we developed a deep-learning approach that augments rainfall data with increased time resolutions to complement relatively lower-resolution products. We propose a neural network architecture based on Convolutional Neural Networks (CNNs), namely TempNet, to improve the temporal resolution of radar-based rainfall products and compare the proposed model with an optical flow-based interpolation method and a CNN baseline model. While TempNet achieves a mean absolute error of 0.332 mm/h, the comparison methods achieve 0.350 mm/h and 0.341 mm/h, respectively. The methodology presented in this study could be used for enhancing rainfall maps with better temporal resolution and for imputation of missing frames in sequences of 2D rainfall maps to support hydrological and flood forecasting studies.
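    The reported comparison metric, mean absolute error in mm/h, is computed as follows (the arrays below are toy data, not the paper's rainfall maps):

    ```python
    import numpy as np

    def mean_absolute_error(pred, truth):
        """MAE in the same units as the rainfall maps (mm/h):
        the average absolute pixel-wise deviation."""
        return float(np.mean(np.abs(pred - truth)))

    truth = np.array([1.0, 2.0, 3.0, 4.0])
    pred = np.array([1.2, 1.8, 3.3, 3.9])
    print(round(mean_absolute_error(pred, truth), 3))  # 0.2
    ```

    MAE weights all errors linearly, so unlike RMSE it is not dominated by a few extreme rainfall pixels, which makes the 0.332 vs. 0.341 vs. 0.350 mm/h gaps directly interpretable as average per-pixel improvements.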

    Multi-set canonical correlation analysis for 3D abnormal gait behaviour recognition based on virtual sample generation

    Small sample datasets and two-dimensional (2D) approaches are challenges for vision-based abnormal gait behaviour recognition (AGBR). The lack of three-dimensional (3D) structure of the human body limits 2D-based methods in abnormal gait virtual sample generation (VSG). In this paper, 3D AGBR based on VSG and multi-set canonical correlation analysis (3D-AGRBMCCA) is proposed. First, the unstructured point cloud data of gait are obtained using a structured light sensor. A 3D parametric body model is then deformed to fit the point cloud data, both in shape and posture, converting the point cloud features into a high-level structured representation of the body. The parametric body model is used for VSG based on the estimated body pose and shape data: symmetry virtual samples, pose-perturbation virtual samples, and various body-shape virtual samples with multiple views are generated to extend the training set. The spatial-temporal features of the abnormal gait behaviour from different views, body poses, and shape parameters are then extracted by a convolutional neural network-based Long Short-Term Memory (CNN-LSTM) network and projected onto a uniform pattern space using deep learning-based multi-set canonical correlation analysis. Experiments on four publicly available datasets show the proposed system performs well under various conditions.
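    Multi-set canonical correlation analysis generalizes classical two-set CCA, which finds projections of two feature sets that are maximally correlated. A numpy-only sketch of the two-set case (the paper's method is a deep, multi-set variant; names here are illustrative) uses the fact that canonical correlations equal the singular values of the product of the sets' orthonormal bases:

    ```python
    import numpy as np

    def cca_first_correlation(X, Y):
        """First canonical correlation between two feature sets
        (rows = samples). Centres each set, takes orthonormal bases
        of the column spaces via SVD, and returns the top singular
        value of their inner product."""
        X = X - X.mean(axis=0)
        Y = Y - Y.mean(axis=0)
        Ux, _, _ = np.linalg.svd(X, full_matrices=False)
        Uy, _, _ = np.linalg.svd(Y, full_matrices=False)
        corr = np.linalg.svd(Ux.T @ Uy, compute_uv=False)
        return float(np.clip(corr[0], 0.0, 1.0))

    rng = np.random.default_rng(0)
    A = rng.standard_normal((100, 3))
    B = A @ rng.standard_normal((3, 3))  # B is an invertible linear map of A
    print(round(cca_first_correlation(A, B), 3))  # 1.0: identical column spaces
    ```

    Because B spans the same subspace as A, the first canonical correlation is exactly 1; feature sets extracted from different camera views of the same gait would typically yield high but imperfect correlations, which is what projection onto the shared pattern space exploits.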