1,793 research outputs found
Deep Learning for Audio Signal Processing
Given the recent surge in developments of deep learning, this article
provides a review of the state-of-the-art deep learning techniques for audio
signal processing. Speech, music, and environmental sound processing are
considered side-by-side, in order to point out similarities and differences
between the domains, highlighting general methods, problems, key references,
and potential for cross-fertilization between areas. The dominant feature
representations (in particular, log-mel spectra and raw waveform) and deep
learning models are reviewed, including convolutional neural networks, variants
of the long short-term memory architecture, as well as more audio-specific
neural network models. Subsequently, prominent deep learning application areas
are covered, i.e. audio recognition (automatic speech recognition, music
information retrieval, environmental sound detection, localization and
tracking) and synthesis and transformation (source separation, audio
enhancement, generative models for speech, sound, and music synthesis).
Finally, key issues and future questions regarding deep learning applied to
audio signal processing are identified.Comment: 15 pages, 2 pdf figure
Inspector Gadget: A Data Programming-based Labeling System for Industrial Images
As machine learning for images becomes democratized in the Software 2.0 era,
one of the serious bottlenecks is securing enough labeled data for training.
This problem is especially critical in a manufacturing setting where smart
factories rely on machine learning for product quality control by analyzing
industrial images. Such images are typically large and may only need to be
partially analyzed where only a small portion is problematic (e.g., identifying
defects on a surface). Since manual labeling these images is expensive, weak
supervision is an attractive alternative where the idea is to generate weak
labels that are not perfect, but can be produced at scale. Data programming is
a recent paradigm in this category where it uses human knowledge in the form of
labeling functions and combines them into a generative model. Data programming
has been successful in applications based on text or structured data and can
also be applied to images usually if one can find a way to convert them into
structured data. In this work, we expand the horizon of data programming by
directly applying it to images without this conversion, which is a common
scenario for industrial applications. We propose Inspector Gadget, an image
labeling system that combines crowdsourcing, data augmentation, and data
programming to produce weak labels at scale for image classification. We
perform experiments on real industrial image datasets and show that Inspector
Gadget obtains better performance than other weak-labeling techniques: Snuba,
GOGGLES, and self-learning baselines using convolutional neural networks (CNNs)
without pre-training.Comment: 10 pages, 11 figure
Anomaly detection and automatic labeling for solar cell quality inspection based on Generative Adversarial Network
Quality inspection applications in industry are required to move towards a
zero-defect manufacturing scenario, withnon-destructive inspection and
traceability of 100 % of produced parts. Developing robust fault detection and
classification modelsfrom the start-up of the lines is challenging due to the
difficulty in getting enough representative samples of the faulty patternsand
the need to manually label them. This work presents a methodology to develop a
robust inspection system, targeting thesepeculiarities, in the context of solar
cell manufacturing. The methodology is divided into two phases: In the first
phase, an anomalydetection model based on a Generative Adversarial Network
(GAN) is employed. This model enables the detection and localizationof
anomalous patterns within the solar cells from the beginning, using only
non-defective samples for training and without anymanual labeling involved. In
a second stage, as defective samples arise, the detected anomalies will be used
as automaticallygenerated annotations for the supervised training of a Fully
Convolutional Network that is capable of detecting multiple types offaults. The
experimental results using 1873 EL images of monocrystalline cells show that
(a) the anomaly detection scheme can beused to start detecting features with
very little available data, (b) the anomaly detection may serve as automatic
labeling in order totrain a supervised model, and (c) segmentation and
classification results of supervised models trained with automatic labels
arecomparable to the ones obtained from the models trained with manual labels.Comment: 20 pages, 10 figures, 6 tables. This article is part of the special
issue "Condition Monitoring, Field Inspection and Fault Diagnostic Methods
for Photovoltaic Systems" Published in MDPI - Sensors: see
https://www.mdpi.com/journal/sensors/special_issues/Condition_Monitoring_Field_Inspection_and_Fault_Diagnostic_Methods_for_Photovoltaic_System
Informative sample generation using class aware generative adversarial networks for classification of chest Xrays
Training robust deep learning (DL) systems for disease detection from medical
images is challenging due to limited images covering different disease types
and severity. The problem is especially acute, where there is a severe class
imbalance. We propose an active learning (AL) framework to select most
informative samples for training our model using a Bayesian neural network.
Informative samples are then used within a novel class aware generative
adversarial network (CAGAN) to generate realistic chest xray images for data
augmentation by transferring characteristics from one class label to another.
Experiments show our proposed AL framework is able to achieve state-of-the-art
performance by using about of the full dataset, thus saving significant
time and effort over conventional methods
- …