
    Fast feedforward non-parametric deep learning network with automatic feature extraction

    In this paper, a new type of feedforward non-parametric deep learning network with automatic feature extraction is proposed. The proposed network is based on human-understandable local aggregations extracted directly from the images, so no feature selection or parameter tuning is required. The network applies nonlinear transformation and segmentation operations to select the most distinctive features from the training images and builds RBF neurons on them to perform classification, with no weights to train. The design is efficient in both computation and time and produces highly accurate classification results. Moreover, the training process is parallelizable, so training time can be reduced further by using additional processors. Numerical examples demonstrate the high performance and very short training time of the proposed network across different applications.
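
    A minimal sketch (not the authors' code) of the idea of classifying with RBF neurons built from selected prototype features, so that no weights are trained. The helper names, the random prototype selection, and the Gaussian width gamma are illustrative assumptions rather than details taken from the paper.

    import numpy as np

    def select_prototypes(features, labels, per_class=10):
        """Pick a few representative feature vectors per class (here: random choice)."""
        prototypes, proto_labels = [], []
        for c in np.unique(labels):
            idx = np.flatnonzero(labels == c)
            chosen = np.random.choice(idx, size=min(per_class, idx.size), replace=False)
            prototypes.append(features[chosen])
            proto_labels.append(np.full(chosen.size, c))
        return np.vstack(prototypes), np.concatenate(proto_labels)

    def rbf_classify(x, prototypes, proto_labels, gamma=1.0):
        """Fire one Gaussian RBF neuron per prototype and vote by the strongest activation."""
        dists = np.sum((prototypes - x) ** 2, axis=1)   # squared distances to prototypes
        activations = np.exp(-gamma * dists)            # RBF responses, no learned weights
        return proto_labels[np.argmax(activations)]

    # Usage with toy data: 100 feature vectors of dimension 32, 3 classes.
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(100, 32))
    labs = rng.integers(0, 3, size=100)
    protos, plabs = select_prototypes(feats, labs)
    print(rbf_classify(feats[0], protos, plabs))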

    Gait recognition and understanding based on hierarchical temporal memory using 3D gait semantic folding

    Gait recognition and understanding systems have a wide range of prospective applications. However, their reliance on unstructured image and video data limits their performance; for example, they are easily affected by multiple views, occlusion, clothing, and object-carrying conditions. This paper addresses these problems using realistic 3-dimensional (3D) human structural data and a sequential pattern learning framework with a top-down attention-modulating mechanism based on Hierarchical Temporal Memory (HTM). First, an accurate 2-dimensional (2D) to 3D human body pose and shape semantic parameter estimation method is proposed, which exploits the advantages of an instance-level body parsing model and a virtual dressing method. Second, using gait semantic folding, the estimated body parameters are encoded into a sparse 2D matrix to construct the structural gait semantic image. To achieve time-based gait recognition, an HTM network is constructed to obtain sequence-level gait sparse distribution representations (SL-GSDRs). A top-down attention mechanism is introduced to handle various conditions, including multiple views, by refining the SL-GSDRs according to prior knowledge. The proposed gait learning model not only helps gait recognition tasks overcome the difficulties of real application scenarios but also provides structured gait semantic images for visual cognition. Experimental analyses on the CMU MoBo, CASIA B, TUM-IITKGP, and KY4D datasets show a significant performance gain in terms of accuracy and robustness.
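
    A minimal sketch, under stated assumptions, of the general idea of folding a vector of body-shape/pose parameters into a sparse binary 2D matrix, in the spirit of HTM-style sparse distributed representations. The grid size, bucket count, and the per-parameter hashing scheme are illustrative choices, not the paper's implementation.

    import numpy as np

    def fold_parameters(params, grid=(32, 32), buckets=16, active_per_param=3):
        """Map each scalar body parameter to a few active cells of a 2D binary grid."""
        sdr = np.zeros(grid, dtype=np.uint8)
        rows, cols = grid
        p = np.asarray(params, dtype=float)
        # Normalise each parameter into [0, 1) and quantise it into a bucket.
        p = (p - p.min()) / (np.ptp(p) + 1e-9)
        bucket = np.minimum((p * buckets).astype(int), buckets - 1)
        for i, b in enumerate(bucket):
            # Deterministically spread a few active bits per (parameter, bucket) pair.
            gen = np.random.default_rng(i * buckets + int(b))
            cells = gen.choice(rows * cols, size=active_per_param, replace=False)
            sdr.flat[cells] = 1
        return sdr

    # Usage: 24 hypothetical body parameters -> one sparse "gait semantic image" per frame.
    frame_sdr = fold_parameters(np.random.rand(24))
    print(frame_sdr.sum(), "active cells out of", frame_sdr.size)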

    Deep Learning for Audio Signal Processing

    Given the recent surge in developments of deep learning, this article provides a review of state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side by side in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and the potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveforms) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, and more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e., audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, and generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.
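
    A minimal sketch of the log-mel-spectrogram + CNN recipe the review describes as a dominant approach. The hop parameters, network shape, and class count are illustrative assumptions, not a reference implementation from the article.

    import librosa
    import numpy as np
    import torch
    import torch.nn as nn

    def log_mel(path, sr=16000, n_mels=64):
        """Load audio and compute a log-scaled mel spectrogram (n_mels x frames)."""
        y, sr = librosa.load(path, sr=sr)
        mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
        return librosa.power_to_db(mel, ref=np.max)

    class SmallAudioCNN(nn.Module):
        """Tiny convolutional classifier over a single-channel log-mel image."""
        def __init__(self, n_classes=10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),        # pool to a fixed-size vector
            )
            self.classifier = nn.Linear(32, n_classes)

        def forward(self, x):                   # x: (batch, 1, n_mels, frames)
            h = self.features(x).flatten(1)
            return self.classifier(h)

    # Usage (hypothetical file name):
    # spec = log_mel("clip.wav")
    # logits = SmallAudioCNN()(torch.tensor(spec)[None, None].float())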