
    Deep Learning for Audio Signal Processing

    Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e., audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified. Comment: 15 pages, 2 pdf figures

    Inspector Gadget: A Data Programming-based Labeling System for Industrial Images

    As machine learning for images becomes democratized in the Software 2.0 era, one of the serious bottlenecks is securing enough labeled data for training. This problem is especially critical in a manufacturing setting where smart factories rely on machine learning for product quality control by analyzing industrial images. Such images are typically large and may only need to be partially analyzed where only a small portion is problematic (e.g., identifying defects on a surface). Since manually labeling these images is expensive, weak supervision is an attractive alternative where the idea is to generate weak labels that are not perfect, but can be produced at scale. Data programming is a recent paradigm in this category that uses human knowledge in the form of labeling functions and combines them into a generative model. Data programming has been successful in applications based on text or structured data and can usually also be applied to images if one can find a way to convert them into structured data. In this work, we expand the horizon of data programming by directly applying it to images without this conversion, which is a common scenario for industrial applications. We propose Inspector Gadget, an image labeling system that combines crowdsourcing, data augmentation, and data programming to produce weak labels at scale for image classification. We perform experiments on real industrial image datasets and show that Inspector Gadget obtains better performance than other weak-labeling techniques: Snuba, GOGGLES, and self-learning baselines using convolutional neural networks (CNNs) without pre-training. Comment: 10 pages, 11 figures
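
    The labeling-function idea described in this abstract can be illustrated with a minimal sketch. This is not Inspector Gadget's implementation: the feature names, thresholds, and the three rules below are hypothetical, and a simple majority vote stands in for the generative model that data programming actually uses to combine labeling functions.

```python
from collections import Counter

# Label constants (illustrative; not taken from the paper).
ABSTAIN = -1
OK = 0
DEFECT = 1


# Each labeling function encodes one piece of human knowledge and may abstain.
# The feature names and thresholds are hypothetical.
def lf_dark_patch(features):
    return DEFECT if features["min_brightness"] < 0.2 else ABSTAIN


def lf_uniform_surface(features):
    return OK if features["brightness_std"] < 0.05 else ABSTAIN


def lf_long_edge(features):
    return DEFECT if features["longest_edge_px"] > 50 else ABSTAIN


LABELING_FUNCTIONS = [lf_dark_patch, lf_uniform_surface, lf_long_edge]


def weak_label(features):
    """Combine labeling-function votes by majority; abstain on ties or no votes.

    Majority voting is a simplification standing in for a learned
    generative model over the labeling functions.
    """
    votes = [lf(features) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # conflicting evidence with no clear winner
    return ranked[0][0]


scratched = {"min_brightness": 0.1, "brightness_std": 0.3, "longest_edge_px": 80}
clean = {"min_brightness": 0.6, "brightness_std": 0.01, "longest_edge_px": 5}
ambiguous = {"min_brightness": 0.1, "brightness_std": 0.01, "longest_edge_px": 5}
```

    Calling `weak_label` on each example dictionary yields a noisy label or an abstention; in a real pipeline the resulting weak labels would train a downstream classifier.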

    Anomaly detection and automatic labeling for solar cell quality inspection based on Generative Adversarial Network

    Quality inspection applications in industry are required to move towards a zero-defect manufacturing scenario, with non-destructive inspection and traceability of 100% of produced parts. Developing robust fault detection and classification models from the start-up of the lines is challenging due to the difficulty in getting enough representative samples of the faulty patterns and the need to manually label them. This work presents a methodology to develop a robust inspection system, targeting these peculiarities, in the context of solar cell manufacturing. The methodology is divided into two phases: In the first phase, an anomaly detection model based on a Generative Adversarial Network (GAN) is employed. This model enables the detection and localization of anomalous patterns within the solar cells from the beginning, using only non-defective samples for training and without any manual labeling involved. In a second stage, as defective samples arise, the detected anomalies will be used as automatically generated annotations for the supervised training of a Fully Convolutional Network that is capable of detecting multiple types of faults. The experimental results using 1873 EL images of monocrystalline cells show that (a) the anomaly detection scheme can be used to start detecting features with very little available data, (b) the anomaly detection may serve as automatic labeling in order to train a supervised model, and (c) segmentation and classification results of supervised models trained with automatic labels are comparable to the ones obtained from the models trained with manual labels. Comment: 20 pages, 10 figures, 6 tables. This article is part of the special issue "Condition Monitoring, Field Inspection and Fault Diagnostic Methods for Photovoltaic Systems" Published in MDPI - Sensors: see https://www.mdpi.com/journal/sensors/special_issues/Condition_Monitoring_Field_Inspection_and_Fault_Diagnostic_Methods_for_Photovoltaic_System

    Informative sample generation using class aware generative adversarial networks for classification of chest Xrays

    Training robust deep learning (DL) systems for disease detection from medical images is challenging due to limited images covering different disease types and severity. The problem is especially acute where there is a severe class imbalance. We propose an active learning (AL) framework to select the most informative samples for training our model using a Bayesian neural network. Informative samples are then used within a novel class aware generative adversarial network (CAGAN) to generate realistic chest X-ray images for data augmentation by transferring characteristics from one class label to another. Experiments show our proposed AL framework is able to achieve state-of-the-art performance by using about 35% of the full dataset, thus saving significant time and effort over conventional methods.