    Learning audio and image representations with bio-inspired trainable feature extractors

    Recent advancements in pattern recognition and signal processing concern the automatic learning of data representations from labeled training samples. Typical approaches are based on deep learning and convolutional neural networks, which require large amounts of labeled training samples. In this work, we propose novel feature extractors that can be used to learn the representation of single prototype samples in an automatic configuration process. We employ the proposed feature extractors in applications of audio and image processing, and show their effectiveness on benchmark data sets. Comment: Accepted for publication in the journal "Electronic Letters on Computer Vision and Image Understanding".

    Learning sound representations using trainable COPE feature extractors

    Sound analysis research has mainly been focused on speech and music processing. The deployed methodologies are not suitable for the analysis of sounds with varying background noise, in many cases with very low signal-to-noise ratio (SNR). In this paper, we present a method for the detection of patterns of interest in audio signals. We propose novel trainable feature extractors, which we call COPE (Combination of Peaks of Energy). The structure of a COPE feature extractor is determined using a single prototype sound pattern in an automatic configuration process, which is a type of representation learning. We construct a set of COPE feature extractors, configured on a number of training patterns. Then we take their responses to build feature vectors that we use in combination with a classifier to detect and classify patterns of interest in audio signals. We carried out experiments on four public data sets: MIVIA audio events, MIVIA road events, ESC-10 and TU Dortmund data sets. The results that we achieved (recognition rate equal to 91.71% on the MIVIA audio events, 94% on the MIVIA road events, 81.25% on the ESC-10 and 94.27% on the TU Dortmund) demonstrate the effectiveness of the proposed method and are higher than the ones obtained by other existing approaches. The COPE feature extractors have high robustness to variations of SNR. Real-time performance is achieved even when a large number of feature values is computed. Comment: Accepted for publication in Pattern Recognition.
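
    A minimal sketch of the two stages described above (configuration on a single prototype and computation of the response on a new sound), assuming the input is an energy map E[frequency_channel, time_frame] such as a Gammatone or mel spectrogram; the peak-selection threshold, the Gaussian tolerance and the weighted-mean combination are illustrative choices, not the authors' exact formulation:

        import numpy as np
        from scipy.ndimage import maximum_filter, gaussian_filter

        def configure(prototype_energy, rel_threshold=0.5):
            """Record local energy peaks of a single prototype sound, with time
            offsets relative to the strongest peak (the reference point)."""
            E = prototype_energy
            peaks = (E == maximum_filter(E, size=5)) & (E > rel_threshold * E.max())
            f, t = np.nonzero(peaks)
            t_ref = t[E[f, t].argmax()]
            return [(fi, ti - t_ref, E[fi, ti] / E.max()) for fi, ti in zip(f, t)]

        def response(energy, model, sigma=1.0):
            """Slide the configured peak constellation over a new energy map;
            blurring gives tolerance to small time/frequency shifts."""
            E = gaussian_filter(energy, sigma)
            out = np.zeros(E.shape[1])
            for t in range(E.shape[1]):
                vals = [w * E[f, t + dt] for f, dt, w in model
                        if 0 <= t + dt < E.shape[1]]
                out[t] = np.mean(vals) if vals else 0.0
            return out  # one response per time frame; e.g. its maximum is the feature value

    A bank of such extractors, each configured on a different prototype, would contribute one value per extractor (for instance the maximum of its response over time) to the feature vector passed to the classifier.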

    Learning skeleton representations for human action recognition

    Automatic interpretation of human actions has gained strong interest among researchers in pattern recognition and computer vision because of its wide range of applications, such as social and home robotics, health care for the elderly, and surveillance, among others. In this paper, we propose a method for the recognition of human actions by analysis of skeleton poses. The method that we propose is based on novel trainable feature extractors, which can learn the representation of prototype skeleton examples and can be employed to recognize skeleton poses of interest. We combine the proposed feature extractors with an approach for the classification of pose sequences based on string kernels. We carried out experiments on three benchmark data sets (MIVIA-S, MSRSDA and MHAD) and the results that we achieved are comparable to or higher than the ones obtained by other existing methods. A further important contribution of this work is the MIVIA-S dataset, which we collected and made publicly available.
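
    A minimal sketch of the sequence-classification stage, under the assumption that each frame has already been mapped to the label of its best-matching prototype pose, so that an action becomes a string of pose labels. The p-spectrum kernel used here is a common string kernel chosen for illustration, not necessarily the exact kernel of the paper:

        from collections import Counter
        import numpy as np
        from sklearn.svm import SVC

        def spectrum_kernel(s, t, p=3):
            """Count the length-p substrings shared by two pose-label strings."""
            cs = Counter(s[i:i + p] for i in range(len(s) - p + 1))
            ct = Counter(t[i:i + p] for i in range(len(t) - p + 1))
            return sum(cs[k] * ct[k] for k in cs if k in ct)

        def gram(rows, cols, p=3):
            return np.array([[spectrum_kernel(a, b, p) for b in cols] for a in rows])

        # Toy data: one character per frame, one string per action sequence.
        train, y = ["aaabbccc", "aabbbccc", "dddeefff", "ddeeefff"], [0, 0, 1, 1]
        test = ["aabbcccc", "dddeeeff"]

        clf = SVC(kernel="precomputed").fit(gram(train, train), y)
        print(clf.predict(gram(test, train)))  # expected: [0 1]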

    Machine vision based system for flower counting in strawberry plants

    Background: For strawberry production, accurate yield prediction is very important to help growers increase their profit by efficiently managing their harvesting operations and setting their contracts with buyers. Strawberry plants produce flowers and fruits simultaneously throughout the season. Strawberry flowers are white in color with yellow pollen at the center, which later becomes a fruit. Strawberry yield can be estimated by counting the number of flowers in a field in advance of harvesting. The objective of this project is to count the number of flowers using image processing techniques, create a map of flower counts using geo-tagging, and provide farmers with an estimate of the yield in a given area. Methods: Strawberry flowers can be at different stages of maturation during imaging. We pre-process images using an edge-preserving smoothing filter to remove noise without removing fine features. The next stage involves segmentation of flowers from the background. Since flowers are brighter than most other components of plants, simple thresholding-based segmentation produces candidate pixels. Flower detection is then conducted using traditional feature engineering with a classifier (such as Histogram of Oriented Gradients, Wavelet Transform, and Local Binary Pattern features), as well as deep-learning-based techniques. Results: Once flowers are detected, the number of flowers is counted to provide farmers with an estimate of yield and variability at different locations in the field. Discussion: One of the biggest challenges with outdoor imaging is the variable lighting conditions. We propose an autonomous system with mounted cameras that travels over the rows of strawberry plants to capture geo-tagged images. Cameras are positioned to capture images from different angles so as to capture occluded flowers. Conclusion: A novel image processing method for accurate strawberry yield prediction is proposed, counting the number of flowers from images for efficient crop management.
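
    A minimal sketch of the detection pipeline outlined above (bright-region thresholding for candidates, HOG description of each candidate patch, and an SVM flower/non-flower classifier); the threshold, patch size and HOG parameters are illustrative assumptions:

        import numpy as np
        from skimage.color import rgb2gray
        from skimage.measure import label, regionprops
        from skimage.transform import resize
        from skimage.feature import hog
        from sklearn.svm import LinearSVC

        PATCH = 64  # candidate patches are resized to PATCH x PATCH before HOG

        def candidate_patches(rgb_image, thresh=0.8):
            """Bright (near-white) regions are flower candidates."""
            gray = rgb2gray(rgb_image)
            patches = []
            for r in regionprops(label(gray > thresh)):
                minr, minc, maxr, maxc = r.bbox
                patches.append(resize(gray[minr:maxr, minc:maxc], (PATCH, PATCH)))
            return patches

        def hog_features(patches):
            return np.array([hog(p, orientations=9, pixels_per_cell=(8, 8),
                                 cells_per_block=(2, 2)) for p in patches])

        # Training uses labelled flower / background patches; counting the positively
        # classified candidates per geo-tagged image gives the per-location counts:
        # clf = LinearSVC().fit(hog_features(train_patches), train_labels)
        # n_flowers = int(clf.predict(hog_features(candidate_patches(img))).sum())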

    Feature learning for information-extreme classifier

    A feature learning algorithm for an information-extreme classifier is considered, based on clustering of Fast Retina Keypoint (FREAK) binary descriptors computed for local features, and on the use of a spatial pyramid kernel to increase the noise immunity and informativeness of the feature representation. A method is proposed for optimizing the parameters of the feature extractor and of the decision rules, based on multi-level coarse feature coding using an information criterion and a population-based search algorithm.
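
    A minimal sketch of such a representation, assuming FREAK descriptors computed at FAST keypoints, a k-means codebook as the coarse coding step, and a two-level spatial pyramid of visual-word histograms; FREAK_create lives in the opencv-contrib package, and the cluster count and pyramid depth are illustrative (the information-criterion and population-based optimization of these parameters is not shown):

        import cv2
        import numpy as np
        from sklearn.cluster import KMeans

        detector = cv2.FastFeatureDetector_create()
        freak = cv2.xfeatures2d.FREAK_create()

        def descriptors(gray):
            kps = detector.detect(gray, None)
            kps, desc = freak.compute(gray, kps)
            pts = np.array([kp.pt for kp in kps])
            return pts, desc.astype(np.float32)  # treat binary bytes as vectors for k-means

        def spatial_pyramid(gray, codebook, levels=2):
            """Concatenate per-cell visual-word histograms for pyramid levels 0..levels-1."""
            pts, desc = descriptors(gray)
            words = codebook.predict(desc)
            h, w = gray.shape
            feats = []
            for lvl in range(levels):
                cells = 2 ** lvl
                for i in range(cells):
                    for j in range(cells):
                        inside = ((pts[:, 1] // (h / cells) == i) &
                                  (pts[:, 0] // (w / cells) == j))
                        feats.append(np.bincount(words[inside], minlength=codebook.n_clusters))
            return np.concatenate(feats)

        # Codebook from training descriptors, then one pyramid vector per image:
        # codebook = KMeans(n_clusters=64).fit(np.vstack([descriptors(g)[1] for g in train_imgs]))
        # X = np.array([spatial_pyramid(g, codebook) for g in train_imgs])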

    A data-reuse aware accelerator for large-scale convolutional networks

    This paper presents a clustered SIMD accelerator template for Convolutional Networks. These networks significantly outperform other methods in detection and classification tasks in the vision domain. Due to their excessive compute and data transfer requirements, these applications benefit greatly from a dedicated accelerator. The proposed accelerator reduces memory traffic through loop transformations such as tiling and fusion, which merges successive layers. Although fusion can introduce redundant computations, it often reduces data transfer and can therefore remove performance bottlenecks. The SIMD cluster is mapped to a Xilinx Zynq FPGA, where it achieves 6.4 Gops of performance with a small amount of resources. The performance can be scaled by using multiple clusters.
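
    A minimal sketch of the fusion idea in plain Python (the accelerator itself is hardware): two successive 3x3 convolution layers are evaluated one output tile at a time, so the full intermediate feature map never has to be stored or transferred; the halo recomputed at tile borders is the redundant computation mentioned above. Sizes and the single-channel case are illustrative simplifications:

        import numpy as np
        from scipy.signal import convolve2d

        K = 3      # kernel size of both layers
        TILE = 16  # output tile size for the fused pair of layers

        def fused_two_layer_conv(x, w1, w2):
            H, W = x.shape[0] - 2 * (K - 1), x.shape[1] - 2 * (K - 1)  # final output size
            out = np.zeros((H, W))
            for ti in range(0, H, TILE):
                for tj in range(0, W, TILE):
                    th, tw = min(TILE, H - ti), min(TILE, W - tj)
                    # Input region needed for this tile: the tile plus the halos of both layers.
                    xin = x[ti:ti + th + 2 * (K - 1), tj:tj + tw + 2 * (K - 1)]
                    mid = convolve2d(xin, w1, mode="valid")   # intermediate stays on-chip
                    out[ti:ti + th, tj:tj + tw] = convolve2d(mid, w2, mode="valid")
            return out

        x = np.random.rand(66, 66)
        w1, w2 = np.random.rand(K, K), np.random.rand(K, K)
        ref = convolve2d(convolve2d(x, w1, mode="valid"), w2, mode="valid")
        assert np.allclose(fused_two_layer_conv(x, w1, w2), ref)  # fusion changes traffic, not results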

    Scaling Deep Learning on GPU and Knights Landing clusters

    The speed of training deep neural networks has become a major bottleneck of deep learning research and development. For example, training GoogLeNet on the ImageNet dataset with one Nvidia K20 GPU takes 21 days. To speed up the training process, current deep learning systems rely heavily on hardware accelerators. However, these accelerators have limited on-chip memory compared with CPUs. To handle large datasets, they need to fetch data from either CPU memory or remote processors. We use both self-hosted Intel Knights Landing (KNL) clusters and multi-GPU clusters as our target platforms. From an algorithmic perspective, current distributed machine learning systems are mainly designed for cloud systems. These methods are asynchronous because of the slow network and high fault-tolerance requirements of cloud systems. We focus on Elastic Averaging SGD (EASGD) to design algorithms for HPC clusters. The original EASGD used a round-robin method for communication and updating, in which the communication is ordered by machine rank ID; this is inefficient on HPC clusters. First, we redesign four efficient algorithms for HPC systems to improve EASGD's poor scaling on clusters. Async EASGD, Async MEASGD, and Hogwild EASGD are faster than their existing counterparts (Async SGD, Async MSGD, and Hogwild SGD, respectively) in all the comparisons. Finally, we design Sync EASGD, which ties for the best performance among all the methods while being deterministic. In addition to the algorithmic improvements, we use system-algorithm codesign techniques to scale up the algorithms. By reducing the percentage of communication from 87% to 14%, our Sync EASGD achieves a 5.3x speedup over the original EASGD on the same platform. We get 91.5% weak scaling efficiency on 4253 KNL cores, which is higher than the state-of-the-art implementation.
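
    A minimal sketch of the elastic averaging update at the heart of (Sync) EASGD, on a toy least-squares problem with simulated workers; the elastic term alpha * (x_i - center) pulls each worker toward a shared center variable, which in turn drifts toward the workers. Hyperparameters are illustrative, and the communication scheduling that the paper actually optimizes is not modeled:

        import numpy as np

        rng = np.random.default_rng(0)
        d, n_workers, eta, alpha = 5, 4, 0.05, 0.1
        A = [rng.normal(size=(100, d)) for _ in range(n_workers)]   # per-worker data shards
        x_true = rng.normal(size=d)
        b = [Ai @ x_true + 0.01 * rng.normal(size=100) for Ai in A]

        x = [np.zeros(d) for _ in range(n_workers)]   # local model copies
        center = np.zeros(d)                          # shared center variable

        for step in range(500):
            for i in range(n_workers):
                grad = A[i].T @ (A[i] @ x[i] - b[i]) / len(b[i])     # local gradient
                x[i] -= eta * grad + eta * alpha * (x[i] - center)   # gradient step + elastic pull
            center += eta * alpha * sum(xi - center for xi in x)     # center follows the workers

        print(np.linalg.norm(center - x_true))  # small: workers and center agree on the solution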

    New Transfer Learning Approach Based on a CNN for Fault Diagnosis

    Induction motors operate in difficult industrial environments. Monitoring the performance of motors in such circumstances is important, as it enables reliable operation of the system. This paper develops a new model for fault diagnosis based on transfer learning of knowledge from the ImageNet dataset. The development of this framework provides a novel technique for the diagnosis of single and multiple induction motor faults. A transfer learning model based on a VGG-19 convolutional neural network (CNN) was implemented, which provided a fast training process with high accuracy. Thermal images of different induction motor conditions were captured with an FLIR camera and applied as inputs to the proposed model. The implementation of this task involved a pre-trained VGG-19 CNN, which provides autonomous feature learning with minimal human intervention. Next, a densely connected classifier was applied to predict the true class. The experimental results confirmed the robustness and reliability of the developed technique, which successfully classified the induction motor faults, achieving a classification accuracy of 99.8%. The use of the VGG-19 network allowed features to be extracted automatically and passed to the decision-making part. Furthermore, the model was compared with other approaches to related problems and proved its superiority and robustness.
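
    A minimal sketch of the transfer-learning setup described above, using torchvision's ImageNet-pretrained VGG-19 with the convolutional features frozen and a small densely connected classifier trained on the thermal images; the number of fault classes and the head architecture are assumptions, not the paper's exact configuration:

        import torch
        import torch.nn as nn
        from torchvision import models

        NUM_CLASSES = 4  # healthy + several fault conditions (assumed number of classes)

        model = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
        for p in model.features.parameters():
            p.requires_grad = False               # keep the ImageNet feature extractor frozen

        model.classifier = nn.Sequential(         # densely connected head, trained from scratch
            nn.Linear(512 * 7 * 7, 256),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(256, NUM_CLASSES),
        )

        optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
        criterion = nn.CrossEntropyLoss()

        # Training loop over a DataLoader of (thermal_image, fault_label) batches:
        # for images, labels in loader:
        #     optimizer.zero_grad()
        #     loss = criterion(model(images), labels)
        #     loss.backward()
        #     optimizer.step()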