38,965 research outputs found
A convolutional neural network based deep learning methodology for recognition of partial discharge patterns from high voltage cables
It is a great challenge to differentiate partial discharge (PD) induced by different types of insulation defects in high-voltage cables. Some types of PD signals have very similar characteristics and are especially difficult to differentiate, even for the most experienced specialists. To overcome this challenge, a convolutional neural network (CNN)-based deep learning methodology for PD pattern recognition is presented in this paper. First, PD testing for five types of artificial defects in ethylene-propylene-rubber cables is carried out in a high-voltage laboratory to generate signals containing PD data. Second, 3500 sets of PD transient pulses are extracted, and then 33 kinds of PD features are established. The third stage applies a CNN to the data; a typical CNN architecture and the key factors which affect CNN-based pattern recognition accuracy are described. Factors discussed include the number of network layers, convolutional kernel size, activation function, and pooling method. This paper presents a flowchart of the CNN-based PD pattern recognition method and an evaluation with 3500 sets of PD samples. Finally, the CNN-based pattern recognition results are shown and the proposed method is compared with two more traditional analysis methods, i.e., support vector machine (SVM) and back propagation neural network (BPNN). The results show that the proposed CNN method has higher pattern recognition accuracy than SVM and BPNN, and that the novel method is especially effective for PD type recognition in cases of signals of high similarity, making it applicable to industrial applications.
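The building blocks the abstract names as accuracy-critical (convolutional kernel size, activation function, pooling method) can be sketched on a synthetic pulse. This is a minimal illustration in NumPy, not the paper's actual architecture; the kernel values and the damped-oscillation "PD pulse" are assumptions for demonstration only.

```python
import numpy as np

def conv1d(signal, kernel):
    """Valid-mode 1-D convolution (cross-correlation, as used in CNNs)."""
    n = len(signal) - len(kernel) + 1
    return np.array([np.dot(signal[i:i + len(kernel)], kernel) for i in range(n)])

def relu(x):
    """ReLU activation: clamp negative responses to zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; drops any trailing remainder."""
    trimmed = x[: len(x) // size * size]
    return trimmed.reshape(-1, size).max(axis=1)

# Synthetic stand-in for a PD transient pulse: damped oscillation + noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 64)
pulse = np.exp(-5 * t) * np.sin(40 * t) + 0.01 * rng.standard_normal(64)

kernel = np.array([0.25, 0.5, 0.25])  # kernel size 3 (a tunable factor)
feature_map = max_pool(relu(conv1d(pulse, kernel)))
print(feature_map.shape)  # one pooled feature map: (31,)
```

A real classifier would stack several such conv/activation/pooling stages and end in a fully connected softmax layer; the abstract's point is that each of these choices (depth, kernel size, activation, pooling) measurably shifts recognition accuracy.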
Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks
Empirical studies show that gradient-based methods can learn deep neural
networks (DNNs) with very good generalization performance in the
over-parameterization regime, where DNNs can easily fit a random labeling of
the training data. Very recently, a line of work explains in theory that with
over-parameterization and proper random initialization, gradient-based methods
can find the global minima of the training loss for DNNs. However, existing
generalization error bounds are unable to explain the good generalization
performance of over-parameterized DNNs. The major limitation of most existing
generalization bounds is that they are based on uniform convergence and are
independent of the training algorithm. In this work, we derive an
algorithm-dependent generalization error bound for deep ReLU networks, and show
that under certain assumptions on the data distribution, gradient descent (GD)
with proper random initialization is able to train a sufficiently
over-parameterized DNN to achieve arbitrarily small generalization error. Our
work sheds light on explaining the good generalization performance of
over-parameterized deep neural networks.
Comment: 27 pages. This version simplifies the proof and improves the presentation of Version 3. In AAAI 202
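The quantity the abstract's bounds control can be stated precisely. As a reminder (standard definition, not a result specific to this paper), the generalization error of the network $\hat f$ returned by gradient descent is the gap between population risk and empirical risk:

\[
\mathrm{gen}(\hat f) \;=\; \underbrace{\mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(\hat f(x), y)\big]}_{\text{population risk}} \;-\; \underbrace{\frac{1}{n}\sum_{i=1}^{n} \ell(\hat f(x_i), y_i)}_{\text{empirical risk}} .
\]

Uniform-convergence bounds control this gap simultaneously over the whole hypothesis class, which becomes vacuous when an over-parameterized class can fit random labels; an algorithm-dependent bound, as in this work, instead controls the gap only for the specific $\hat f$ that gradient descent actually reaches.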
Reducing model bias in a deep learning classifier using domain adversarial neural networks in the MINERvA experiment
We present a simulation-based study using deep convolutional neural networks
(DCNNs) to identify neutrino interaction vertices in the MINERvA passive
targets region, and illustrate the application of domain adversarial neural
networks (DANNs) in this context. DANNs are designed to be trained in one
domain (simulated data) but tested in a second domain (physics data) and
utilize unlabeled data from the second domain so that during training only
features which are unable to discriminate between the domains are promoted.
MINERvA is a neutrino-nucleus scattering experiment using the NuMI beamline at
Fermilab. A-dependent cross sections are an important part of the physics
program, and these measurements require vertex finding in complicated events.
To illustrate the impact of the DANN, we used a modified set of simulated events in
place of physics data during the training of the DANN and then used the labels
of the modified simulation during its evaluation. We find that deep
learning based methods offer significant advantages over our prior track-based
reconstruction for the task of vertex finding, and that DANNs are able to
improve the performance of deep networks by leveraging available unlabeled data
and by mitigating network performance degradation rooted in biases in the
physics models used for training.
Comment: 41 pages
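The mechanism that lets a DANN "promote only features which are unable to discriminate between the domains" is the gradient reversal layer: an identity map in the forward pass whose backward pass negates (and optionally scales) the domain classifier's gradient before it reaches the shared feature extractor. A hedged NumPy sketch of just that trick follows; the function names and the `lambda_` scaling parameter are illustrative assumptions, not the MINERvA implementation.

```python
import numpy as np

def gradient_reversal_forward(features):
    """Forward pass: identity, so the domain head sees the features unchanged."""
    return features

def gradient_reversal_backward(grad_from_domain_head, lambda_=1.0):
    """Backward pass: negate (and scale) the domain head's gradient.

    The feature extractor is thereby pushed *away* from features that help
    distinguish simulated events from physics data, while the label head's
    (un-reversed) gradient still drives it toward task-discriminative features.
    """
    return -lambda_ * grad_from_domain_head

# Toy gradient flowing back from the domain classifier:
grad = np.array([0.5, -0.2, 0.1])
reversed_grad = gradient_reversal_backward(grad, lambda_=1.0)
print(reversed_grad)  # each component negated
```

During training the total objective combines the label loss minus the weighted domain loss; because the unlabeled second-domain events still receive domain labels ("real" vs. "simulated"), they contribute to the adversarial term even without physics labels.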