Fault detection and location are important front-end tasks in assuring the reliability of power electronic circuits. In essence, both tasks can be treated as classification problems. This paper presents a fast fault classification method for power electronic circuits that uses the support vector machine (SVM) as a classifier and the wavelet transform as a feature extraction technique. The one-against-rest SVM and the one-against-one SVM are two general approaches to fault classification in power electronic circuits. However, these methods have a high computational complexity; therefore, in this design we employ a directed acyclic graph (DAG) SVM to implement the fault classification. The DAG SVM is close to the one-against-one SVM in classification performance, but it is much faster. Moreover, in the presented approach, the DAG SVM is improved by introducing the method of K-nearest neighbours to reduce some computations, so that the classification time can be further reduced. A rectifier and an inverter are used to demonstrate the effectiveness of the presented design.
Introduction
Power electronic circuits (PECs) can be found extensively in industrial, military and residential applications [1] . High thermal and frequent mechanical stresses during the operations can accelerate the failure process of PECs. Once a fault inside a PEC occurs, unplanned electrical device breakdown may be triggered, in some cases associated with a very high cost or even casualties. For reasons of safety, reliability and maintenance, a fault has to be detected and diagnosed as soon as possible after its occurrence [2] . In some PECs with the fault tolerance capability, fault detection and diagnosis are necessary steps [3] .
In essence, fault detection and fault diagnosis fall into the category of fault classification. Fault classification methods for PECs can be divided into three groups: model-based, expert system-based and artificial intelligence (AI)-based ones [4]. Among these methods, the AI-based ones are attractive, because they have some advantages in comparison with other methods [1]. An AI-based method considers the whole system as a black box, whose internal details are not relevant. This avoids the problem of circuit system modelling. Classifiers based on the AI technique, such as the fuzzy inference method [5−7], have been proved to be effective. Compared with methods based on pure hardware circuits [8], AI-based methods usually employ algorithms, which are easy and flexible to transplant and upgrade while the corresponding hardware is kept unchanged.
In the AI applications, the Neural Network (NN) is a good classifier, which has good performance in fault classification of PEC. In [9] , the radial-basis-function (RBF) NN is adopted to perform fault detection of an induction motor drive circuit. A back-propagation NN (BPNN) is presented in [10] to diagnose faults of a three-phase inverter. In this example, the classification accuracy of over 95% is reported. In [11] , a multi-layered perceptron network is employed to classify open faults of a simulated voltage source inverter (VSI). Another study of BPNN application to a multi-level inverter fault classification is described in [2] . In some studies, two or more NNs are integrated to perform the classification task, which can improve the classification performance of a diagnostic system [12−14] . The NN-based method has also some drawbacks. For example, different NN trainings may lead to different classification results. In addition, high-dimensional data can result in a long training process, or even a convergence failure. Focusing on these drawbacks, the NN classifier can be improved with other methods. For example, in [15] , before being input to the classifier, the fault samples are pre-processed with Principal Component Analysis (PCA) and Genetic Algorithm (GA), which can reduce dimensions of the training samples. In [16] , the BPNN structure is optimized to improve its classification performance.
Recently, applications of the Support Vector Machine (SVM) to fault classification of PECs have been reported. The SVM has some excellent characteristics, e.g. it needs fewer adjustable parameters and can find the global solution easily during training, thus leading to stable classification results. The conventional SVM can create binary classes, and such a classifier is called a binary SVM (BSVM). In [17], one BSVM is used to detect whether the inverter is faulty, and the other BSVM can localize the faulty power switch (upper or bottom half-bridge). Another application of BSVM to the fault detection of an induction motor drive is described in [18]. Generally, diagnosing a PEC involves a multi-class classification. In the domain of machine learning, a multi-class classifier design for SVM involves two methods [19]. The first method creates a multi-class classification in one step, whereas the second combines several BSVMs to form a multi-class classifier, which has three basic forms: one-against-rest SVM, one-against-one SVM and Directed Acyclic Graph (DAG) SVM. In diagnosing a PEC, the one-against-rest SVM and one-against-one SVM have been used. For instance, in [20] and [21], one-against-rest SVM classifiers are adopted to perform fault diagnosis of simulated rectifiers. Two examples of diagnosing inverters with one-against-rest SVMs are described in [22] and [23]. The application of one-against-one SVM to the diagnosis of an induction motor drive can be found in [24]. The use of DAG SVM, however, is seldom reported in fault classification of PECs.
According to the experimental results obtained in [19], both DAG SVM and one-against-one SVM are suitable for practical use, because they can always achieve high accuracy in applications. A one-against-one SVM performs classification with a greater number of computations, so that fault classification is a time-consuming task for this classifier. In this research, we apply a DAG SVM to PEC fault classification and, moreover, we attempt to improve the DAG SVM by employing an additional method. Compared with the conventional DAG SVM and other SVM classifiers, the new method needs fewer BSVMs to perform fault classification and, accordingly, the classification time can be shortened. Also, the classification accuracy of the presented method is very close to that of the conventional DAG SVM. Hence, the presented method can be considered as an alternative classifier for the DAG SVM. Experiments on a rectifier and an inverter were performed to prove the effectiveness of the presented method. For the purpose of comparison, five classifiers were designed and examined regarding their classification accuracy and testing time.
Basic theories concerning SVM classifier

Support vector machines for binary classification
A standard support vector machine classifier, invented by Vapnik and his colleagues [25], is founded on statistical learning theory and implements Structural Risk Minimization (SRM) [26]. It can create a binary classification with excellent performance. The standard binary classifier can create both linear and nonlinear classifications. In the domain of fault detection and diagnosis of PECs, the nonlinear classification seems to be more practical. The nonlinear BSVM adopts a mapping function $\psi(\cdot)$, which maps data samples from the measurement space to a high-dimensional space. The binary classes can become linearly separable in the high-dimensional space. This principle is illustrated in Fig. 1, where ○ and • represent class I and class II, respectively. In order to implement the BSVM, the margin between samples in the high-dimensional space should be maximized. Assume a data set $\{x_i\}$ ($i = 1, 2, \ldots, Q$, where $Q$ is the number of data samples and $x_i$ is the $i$th data sample in the measurement space) with labels $y_i \in \{+1, -1\}$ (one value for Class I, the other for Class II). The optimal hyper-plane can be represented with:

$$w^T \psi(x) + b = 0$$

where: $w$ is the weight vector of the optimal hyper-plane; $b$ is a bias. In order to allow some samples to be misclassified, and so reduce their effect on the position of the decision boundary, the slack variables $\xi_i \ge 0$ are necessary. Hence, taking the misclassified samples into account, the samples should be classified correctly beyond the margin:

$$y_i \left( w^T \psi(x_i) + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, Q$$

Maximizing the margin means solving the following quadratic optimization problem:

$$\min_{w,\, b,\, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{Q} \xi_i$$

where $C$ is a penalty parameter for balancing the classification accuracy and the complexity of the decision boundary. Solving this optimization problem needs the Lagrange multipliers (LMs) $\lambda_i \ge 0$ and $\mu_i \ge 0$:

$$L = \frac{1}{2} w^T w + C \sum_{i=1}^{Q} \xi_i - \sum_{i=1}^{Q} \lambda_i \left[ y_i \left( w^T \psi(x_i) + b \right) - 1 + \xi_i \right] - \sum_{i=1}^{Q} \mu_i \xi_i$$

The primal variables are removed by setting the partial derivatives of $L$ with respect to $w$ and $b$ to zero, which yields the dual formulation $L^*$:

$$\max_{\lambda} \; L^* = \sum_{i=1}^{Q} \lambda_i - \frac{1}{2} \sum_{i=1}^{Q} \sum_{j=1}^{Q} \lambda_i \lambda_j y_i y_j \left( \psi(x_i)^T \cdot \psi(x_j) \right), \quad \text{subject to} \quad \sum_{i=1}^{Q} \lambda_i y_i = 0, \quad 0 \le \lambda_i \le C$$

where: $\lambda_i$, $\lambda_j$ are the LMs of the $i$th and $j$th data samples, respectively; $(\cdot)$ is an inner product; $T$ denotes the transpose of a vector.
The solution of this optimization problem generates support vectors (SVs), whose corresponding LMs satisfy $\lambda_i > 0$. Let the number of SVs be $n_{sv}$. Considering a kernel function $K(x_i, x_j) = \psi(x_i)^T \cdot \psi(x_j)$, the decision function of the BSVM becomes:

$$f(t) = \sum_{k=1}^{n_{sv}} \hat{\lambda}_k y_k K(x_k, t) + b$$

where: $t$ is a data sample to be classified; $\hat{\lambda}_k > 0$ is the LM of the $k$th SV $x_k$. The sign of $f(t)$ gives the predicted class. The kernel function has several forms [27]. In our experiments, the RBF kernel function ($K(x_i, t) = \exp\left( -\lVert x_i - t \rVert^2 / (2\sigma^2) \right)$, where $\sigma > 0$ is a kernel parameter; $x_i$ is the $i$th SV) is considered, because this nonlinear kernel function can always lead to good classification performance.
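As an illustration of how the BSVM decision function with an RBF kernel is evaluated for a new sample, a minimal Python sketch is given below. It is not the authors' Matlab implementation; the function names, the toy support vectors and the $2\sigma^2$ scaling of the kernel are assumptions made only for this example.

```python
import math


def rbf_kernel(x, t, sigma):
    """RBF kernel exp(-||x - t||^2 / (2 sigma^2)); the 2*sigma^2 scaling is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, t))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def bsvm_decision(t, support_vectors, multipliers, labels, bias, sigma):
    """Decision value f(t) = sum_k lambda_k * y_k * K(x_k, t) + b."""
    return bias + sum(
        lam * y * rbf_kernel(x, t, sigma)
        for x, lam, y in zip(support_vectors, multipliers, labels)
    )


# Toy model with one support vector per class (hypothetical values, not trained).
svs = [(0.0, 0.0), (2.0, 2.0)]
lams = [1.0, 1.0]
ys = [-1, +1]

score = bsvm_decision((1.9, 2.1), svs, lams, ys, bias=0.0, sigma=1.0)
label = 1 if score >= 0 else -1   # the sign of f(t) selects the class
print(score, label)
```

The cost of one evaluation grows with the number of support vectors, which is why the number of BSVMs evaluated per sample matters for the classification time.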
Two conventional multi-class SVMs for PEC fault classification
A conventional one-against-rest SVM employs the Winner-Takes-All (WTA) rule to implement a pattern classification [28]. This classifier is simple to use in the fault classification of PECs. For N fault classes, N BSVMs are needed. In each training, the $i$th class (labelled with "−1") is separated from the remaining (N−1) classes (labelled with "+1"). Finally, a sample $t$ is assigned to the fault class whose corresponding decision function has the minimum value:

$$\mathrm{class}(t) = \arg\min_{i = 1, \ldots, N} f_i(t)$$

where $f_i(t)$ is the decision function of the $i$th BSVM for a sample $t$. For the one-against-one SVM, altogether N(N−1)/2 BSVMs are constructed. The decision function for the BSVM formed by classes i and j (i ≠ j) can be expressed in the form of:

$$f_{ij}(t) = \sum_{k=1}^{n_{sv}^{i,j}} \hat{\lambda}_k^{i,j} \, y_k^{i,j} \, K\!\left( x_k^{i,j}, t \right) + b^{i,j}$$

where: $n_{sv}^{i,j}$ is the number of SVs; $y_k^{i,j}$ is the label of the $k$th SV; $b^{i,j}$ is the bias of this BSVM.
In the final stage, all decision functions of BSVMs need to vote for the appropriate class. The max-wins strategy is adopted to find a class which wins the maximum votes. However, with the increase of N, the number of computations for this classifier will increase drastically, and thus this method will probably become unsuitable for fast fault classification of PECs.
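The two decision rules described above can be sketched in Python as follows. This is only an illustrative sketch: the function names and the toy decision values are hypothetical, classes are indexed from 0, and ties in the max-wins vote are resolved in favour of the lowest class index, which is an assumption.

```python
from collections import Counter
from itertools import combinations


def one_against_rest(scores):
    """scores[i] = f_i(t); class i was trained with label -1, so the smallest value wins."""
    return min(range(len(scores)), key=lambda i: scores[i])


def one_against_one(pair_scores, n_classes):
    """pair_scores[(i, j)] = f_ij(t) for i < j; a negative value votes for i, otherwise for j."""
    votes = Counter()
    for i, j in combinations(range(n_classes), 2):
        votes[i if pair_scores[(i, j)] < 0 else j] += 1
    # max-wins; ties go to the lowest class index (an assumption)
    return max(range(n_classes), key=lambda c: votes[c])


def n_pairwise(n_classes):
    """Number of BSVMs evaluated by the one-against-one SVM: N(N-1)/2."""
    return n_classes * (n_classes - 1) // 2


print(one_against_rest([0.7, -1.2, 0.4]))                                # class 1
print(one_against_one({(0, 1): -0.8, (0, 2): -0.3, (1, 2): 0.5}, 3))     # class 0
print(n_pairwise(7), n_pairwise(22))                                     # 21 and 231 BSVMs
```

For the 22 classes of the rectifier case study, the one-against-one vote needs 231 BSVM evaluations per sample, against 22 for the one-against-rest SVM.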
Presented SVM classifier
Typical Structure of DAG SVM
The classifier employed in our research is a typical DAG SVM [29] , whose training phase is the same as in the one-against-one SVM by constructing (N−1)N/2 BSVMs for N fault classes. In the classification phase, the BSVMs are arranged to form a directed acyclic graph. Each node of the DAG SVM is represented with Bij, indicating a BSVM classifier corresponding to class i and class j (i ≠ j). Except for the root node, each node has one input and two outputs which stand for the possible decision values (left and right branch) of the BSVM. Fig. 2 shows a typical model of DAG SVM with N = 5 (for simplicity, assume five fault classes to be marked: '0', '1', '2', '3' and '4', respectively).
The DAG SVM classifier structure is similar to a pyramid, which can be partitioned into N layers for N fault classes. For instance, in Fig. 2 , the root node is the first layer (containing B04), and the second layer contains two nodes (containing B03 and B14), … , the kth layer (k < N) contains k BSVMs, …, and so on. The final layer contains only leaves, which represent the five separated classes. The classification task is initiated from the first layer and is stopped in the final layer. In each layer (except for the final layer), one BSVM is evaluated to generate the output result, which becomes the input of a BSVM in the next layer. Fig. 2 gives an illustration of such a flow path (in red dashed curve), in which B04→B03→B13→B12→'1' are evaluated one after the other. Hence, for N fault classes, (N−1) BSVMs need to be evaluated for each classification task.
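The layer-by-layer evaluation can be sketched in Python as below. This is a minimal sketch, not the authors' code: `dag_classify` and the hand-picked decision values are hypothetical, and the sign convention (a negative value at node Bij favours class i) is an assumption.

```python
def dag_classify(decide, n_classes):
    """Walk the DAG from the root node B(0, N-1) down to a leaf.

    decide(i, j) returns the decision value of node B_ij for the sample being
    classified; a negative value favours class i (so class j is eliminated),
    otherwise class i is eliminated. This sign convention is an assumption.
    """
    lo, hi = 0, n_classes - 1
    path = []
    while lo < hi:
        path.append((lo, hi))
        if decide(lo, hi) < 0:
            hi -= 1      # "not class hi": move to B(lo, hi-1)
        else:
            lo += 1      # "not class lo": move to B(lo+1, hi)
    return lo, path


# Decision values chosen by hand to reproduce the path B04 -> B03 -> B13 -> B12 -> '1'.
toy_values = {(0, 4): -1.0, (0, 3): 1.0, (1, 3): -1.0, (1, 2): -1.0}
fault_class, visited = dag_classify(lambda i, j: toy_values[(i, j)], n_classes=5)
print(fault_class, visited)
```

Each pass through the loop eliminates exactly one class, so the walk always visits (N−1) nodes, whichever branch is taken.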
Improvement of DAG SVM
Generally, the DAG SVM is a fast classifier, but it still needs to compute (N−1) decision functions for the BSVMs. A typical DAG SVM always starts from the root node; in this study, however, we consider changing the starting node of the DAG SVM, which can reduce the evaluation time of this SVM classifier. For example, in Fig. 2, if the evaluation starts from B03 rather than from the root node, the evaluation flow path becomes B03→B13→B12→'1'. In this case, the computation of the BSVM decision function at node B04 can be bypassed. The key problem is how to know that it is node B04 that should be avoided; in other words, how to find a limited set of nodes which participate in the computation.
In this paper, the method of K-Nearest Neighbours (K-NN) is adopted as an auxiliary classifier to find the limited set of nodes. The K-NN classifier is an easy-to-use, non-parametric method. It has been applied to the fault diagnosis of a generator rotor [30].
Assume N fault classes, each fault class containing L data samples. The centroid for class j is defined in the measurement space:

$$c_j = \frac{1}{L} \sum_{i=1}^{L} x_{ij}$$

where $x_{ij}$ is the $i$th training sample of class j (j = 1, 2, …, N). The K-NN method selects the K closest neighbours based on the Euclidean distances between an unknown sample x and the centroids. The K fault classes corresponding to the K closest centroids fall into a limited set, from which the starting node can be chosen. Fig. 3 illustrates the way of finding the starting node (N = 5) for K = 3. In this figure, three steps are implemented.
Step 1 computes the Euclidean distances between x and the centroids, and five distances are obtained. These distances are sorted in the ascending order in Step 2 and the first 3 nearest neighbours (assumed to be '0', '1' and '3') are selected. In Step 3, we can observe that three selected classes, whose corresponding leaves are indicated in the final layer of DAG SVM in Fig. 3 , are derived from the subsidiary DAG, enclosed in a triangle marked by dashed lines. The starting node (i.e. B03) is obtained as the root node of the sub graph.
Another way of obtaining the starting node is the use of the information included in indexes. In this case, the index for each class should be predefined and arranged according to the DAG SVM structure. For example, the index for class '0' is 0; for class '1' − 1; …; and so on. The indices i and j for a starting node Bij correspond to the minimal and maximal values of selected K numbers, respectively. Therefore, in a simple way, the starting node B03 can be obtained.
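A minimal Python sketch of the index-based selection of the starting node is given below. The helper names and the toy centroids are hypothetical; the centroids are placed by hand so that classes '0', '1' and '3' are the three nearest, as in the example of Fig. 3.

```python
import math


def class_centroids(samples_by_class):
    """Mean vector of the training samples of each class."""
    result = []
    for samples in samples_by_class:
        dim = len(samples[0])
        result.append([sum(s[d] for s in samples) / len(samples) for d in range(dim)])
    return result


def starting_node(x, centroids, k):
    """Indices (i, j) of the starting node B_ij: min and max index among the K nearest centroids."""
    dists = [math.dist(x, c) for c in centroids]
    nearest = sorted(range(len(dists)), key=lambda c: dists[c])[:k]
    return min(nearest), max(nearest)


# Hypothetical 2-D centroids for five classes '0'..'4'.
cents = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 1.0), (9.0, 9.0)]
i, j = starting_node((0.6, 0.2), cents, k=3)
print(i, j)        # starting node B03
print(j - i)       # BSVMs evaluated from B_ij down to a leaf
```

Starting from Bij, the DAG walk evaluates (j − i) BSVMs, so in this toy case 3 evaluations replace the 4 needed from the root node B04.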
Classification system design based on SVM
The design of a classification system based on an SVM classifier is similar to that based on a neural network, and the steps of the presented improved DAG SVM (iDAG SVM) system design for PEC are as follows:
1) Feature extraction. The original current or voltage signals are sampled from the available sensors. These signals contain noise or redundant information, so they need to be preprocessed by signal processing techniques, such as PCA [15, 20, 31], FFT [2, 32], wavelet transformation (WT) [6, 7, 17], S-transformation [21], or Concordia transformation [9, 18]. This step generates feature samples, which can be used in the offline training.
2) Offline training of the SVM. Prior to the training, the feature samples need to be assigned labels (+1 or −1). In our design, in the BSVM training of the one-against-rest SVM, the feature samples corresponding to one class are labelled with −1 and the other faults' samples are labelled with +1. For the BSVM training of the one-against-one SVM, features for the ith class are labelled with −1, and the features for the jth class are labelled with +1 (i < j). After training, for each BSVM, the generated SVM parameters are saved. Also, considering the iDAG SVM, the centroid of each class needs to be calculated and saved.
3) Fault classification. Given a new sample, the diagnostic flow is implemented according to Fig. 3.
Case studies: rectifier and inverter
Simulated rectifier
The first circuit is a three-phase full-bridge rectifier with six uncontrolled diodes and a load resistor Rload, as shown in Fig. 4. This topology can be used in aerospace power systems and many industrial power electronic converter design applications. This circuit is modelled and simulated with Matlab R2010b-Simulink. In this simulation, the Rload value is set to 550 Ω with 10% tolerance (to simulate the load fluctuation) and the filter capacitor Cf value is set to 10 µF; the nominal phase frequency and voltage of the input source (Ua, Ub and Uc) are 400 Hz and 23 V rms, respectively. The output voltage Ud on the load is selected as an accessible signal.
In this circuit, open-circuit faults for the diodes are examined. Faulty diodes and their fault codes are listed in Table 1. {Ti, Tj} (i, j = 1,…,6 and i ≠ j) means that two diodes Ti and Tj are faulty simultaneously. In this simulation, f0, indicating a sound circuit, is regarded as a special class. Hence, twenty-two classes are considered. For each fault, this circuit model is simulated 50 times; each time the load value is varied and the corresponding signal Ud is sampled (the sample rate for the simulation is 20 kHz). In this way, altogether 50 samples for each fault can be collected. A randomly selected segment of a sample for each fault is shown in Fig. 5.

Table 1. Faulty diodes and their fault codes.
Fault code | Faulty diodes | Fault code | Faulty diodes | Fault code | Faulty diodes
f0 | none (sound circuit) | f8 | … | f16 | {…, T2}
f1 | T1 | f9 | {T2, T5} | f17 | {T2, T3}
f2 | T2 | f10 | {T1, T3} | f18 | {T3, T4}
f3 | T3 | f11 | {T1, T5} | f19 | {T4, T5}
f4 | T4 | f12 | {T2, T4} | f20 | {T5, T6}
f5 | T5 | f13 | {T2, T6} | f21 | {T6, T1}
f6 | T6 | f14 | {T3, T5} | |
f7 | {T1, T4} | f15 | {T4, T6} | |

Feature extraction. In this example, the WT method is applied to the Ud waveforms of each fault class to extract features. The WT is a useful technique [33] that can be used to decompose the collected data into the time-frequency domain, in which coarse coefficients and detail coefficients can be obtained. A simplified diagram of the WT decomposition tree is shown in Fig. 6. The coarse coefficients in the low frequency band can indicate the outline of the waveform. Generally, different waveforms, indicating different fault classes, can have different coarse coefficients. Hence, the coarse coefficients are selected as the fault features in our research. The detail coefficients in the high frequency band, however, are not considered as features, because these coefficients can be easily corrupted by noise. A feature vector E = [nCA, nCA, nCA, nCA, nCA] can be extracted. Finally, the feature vector E should be normalized to have zero mean and unit variance. In the machine learning domain, this is a commonly used means of avoiding a large data range.
Result.
The Matlab codes for all classifiers were run on a P4 personal computer with 2.6 GHz dual CPUs and 2 GB RAM. Our operating system is Windows XP.
The feature set is split into two parts: a training set and a testing set. The training set contains 10 samples of each class, whereas 40 samples of each class are used for testing. The penalty parameter for the SVM is set to 100, and σ is varied across the range {1, 2, 4, 8, 16, 32}. With these parameters, each SVM classifier can generate six results; the best result for each SVM is recorded. In this experiment, all SVM classifiers can achieve 100% testing accuracy when σ = 1.
The iDAG SVM needs to confirm the value of K. In this research, we performed an exhaustive offline searching for K, ranging from 2 to 22, to find a suitable inflection point. In each search, the K value was changed, the iDAG SVM was used to evaluate the testing set, and accuracy was recorded. The testing accuracy, as well as the testing time, as functions of K, are shown in Figs. 7a−7b. In Fig. 7a , we can observe that the accuracy curve is simple and any value of K can lead to 100% classification accuracy. Hence, we choose K = 2 as a suitable value, which gives the shortest testing time.
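The selection rule used above (keep the best accuracy, then prefer the shortest testing time) can be written as a short Python sketch. The function name is hypothetical, and the numbers in `search` are invented for illustration only; they are not the measurements of Fig. 7.

```python
def choose_k(results):
    """results maps K -> (accuracy, testing_time); prefer the highest accuracy, then the shortest time."""
    return min(results, key=lambda k: (-results[k][0], results[k][1]))


# Hypothetical search results, for illustration only (not the paper's measurements).
search = {2: (1.00, 0.8), 3: (1.00, 1.1), 4: (1.00, 1.5), 5: (0.99, 1.9)}
best_k = choose_k(search)
print(best_k)
```

When several values of K give the same accuracy, this rule returns the one with the shortest testing time, which corresponds to the choice K = 2 made in the text.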
Five classifiers are compared in terms of the classification accuracy (Acc), training time (TrT) and testing time (TeT) for the testing set. The comparison of the SVM classifiers' performance is based on their best results (σ = 1) and is shown in Table 2. The BPNN used in this study is a feed-forward neural network with a three-layered structure, whose training parameters are shown in Fig. 8. Training of this neural classifier was performed with the Matlab toolbox for neural networks. Note that the number of hidden neurons is 26, with which a good result can be obtained. Also, the neural classifier was trained 3 times and the best result was added to Table 2. The Matlab function used to validate the testing set is 'sim'.
Fig. 8. The neural network used in our experiment.
Actual rectifier
An actual rectifier is mainly designed with six discrete power diodes (type: 6A10). A photo of the circuit fault diagnosis system (including a fault setup and a fault data acquisition unit) is shown in Fig. 9. In this system, each diode is connected in series with a relay, and an action of the relay, manually controlled with a button, can generate an open fault of the diode. A single fault can be generated by pushing one button, whereas double faults need two buttons to be pushed simultaneously. We used this system to collect 50 samples for each fault. Each time a fault sample was collected, we changed the load randomly within a 10% tolerance. The fault signals were collected with a Handyscope HS4 (12-bit ADC inside), and the sample rate for this data collection device was set to 20 kHz, which was consistent with the setup during the simulation.
For each fault, a segment of one sample was randomly selected, as shown in Fig. 10. Subsequently, the WT was applied to these signals for feature extraction. The starting point (closely related to phase information) of the signal is important for the WT, and we found the first starting point of the signal by zero-crossing detection of the input power source waveform Uab. The basic principle of finding a starting point of Ud is shown in Fig. 11, which presents the waveforms of Ud and Uab in a sound circuit condition. In this research, the steps of feature extraction and classifiers' design were identical to those of the simulated rectifier. Fig. 12 presents the curves of accuracy and testing time for the iDAG SVM. The comparison of the classifiers' performance with their best results is shown in Table 3 for σ = 4. For the actual rectifier circuit we also performed another experiment, in which both the input power (including amplitude and frequency) and the load were randomly fluctuated. The tolerances of input power amplitude, input frequency and load were set to 5%, 5% and 10%, respectively. This experiment aimed to examine the classifier performance in a complicated operation environment. The curves of finding a good result in the iDAG SVM classifier design are shown in Fig. 13 and, for σ = 1, several classifiers can achieve the best results, which are listed in Table 4.
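The alignment of Ud to a zero crossing of Uab, described earlier in this subsection, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, a rising (negative to non-negative) crossing is assumed, and the signals are synthetic sinusoids at 400 Hz sampled at 20 kHz rather than measured data.

```python
import math


def first_rising_zero_crossing(u):
    """Index of the first sample where u goes from negative to non-negative, or None."""
    for n in range(1, len(u)):
        if u[n - 1] < 0.0 <= u[n]:
            return n
    return None


# Synthetic reference waveform: 400 Hz at 20 kHz gives 50 samples per cycle.
samples_per_cycle = 20000 // 400
u_ab = [math.sin(2.0 * math.pi * (n - 7.5) / samples_per_cycle) for n in range(200)]
u_d = [abs(v) for v in u_ab]          # stand-in for the measured output voltage

start = first_rising_zero_crossing(u_ab)
segment = u_d[start:start + samples_per_cycle]   # one cycle of Ud with a fixed phase
print(start, len(segment))
```

Cutting every sample at the same phase of Uab keeps the wavelet coefficients of different samples comparable, since the WT features depend on where the analysed segment begins.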
Inverter
The third PEC, with its structure shown in Fig. 14 , is an actual three-phase inverter which drives a motor with a symmetric structure. The motor is a 50 W brushless DC motor (BLDCM) with 3-phase and 5-pair poles, and the BLDCM shaft is coupled with an electric fan, running at a rate of 800 rpm (~5% fluctuation). This system is used for cooling in industry applications. Although the used inverter has a low driving power, the considered fault classification algorithms can be extended to inverters with a higher power in a straightforward way.
In this inverter, the MOSFETs are driven with a square wave pulse width modulation (PWM), and the voltage value is Vdc = 48 V. We consider a single switch open fault for this drive circuit, so a total of 7 classes need to be classified. In this case, the fault code for switch Ti is fi (i = 1, …, 6), and f0 denotes the sound condition of this circuit. In this example, three phase currents (ia, ib and ic) need to be collected synchronously. In this experiment, forty samples for each fault pattern were collected on the experiment platform, in which an open fault of a power switch could be set by an emulator of the drive circuit controller.
Feature extraction. The presented feature extraction algorithm needs two steps.
Step 1: The WT is adopted to reduce the effect of noise. The 'Haar' wavelet is selected as the mother function to decompose the current signals into coarse coefficients ($i_a^w(k)$, $i_b^w(k)$, $i_c^w(k)$) and detail coefficients in layer 3. The detail coefficients were discarded in this design.
In order to reduce the effect of load, with reference to [17], normalization of the wavelet coefficients can be considered:
In Fig. 15, we can observe that, after decomposition of 3 layers, the outlines of the current waveforms are clear. Also, we can find that the waveforms for each fault are different from the others, and hence they can be used for subsequent fault classification.
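The extraction of layer-3 coarse coefficients with the Haar wavelet can be illustrated with a short Python sketch. It uses the orthonormal Haar filter and pads odd-length input by repeating the last sample; the function name and the padding rule are assumptions, and the normalization step is not included.

```python
import math


def haar_approximation(signal, levels):
    """Coarse (approximation) coefficients after `levels` Haar DWT steps; details are dropped."""
    approx = list(signal)
    for _ in range(levels):
        if len(approx) % 2:                 # pad odd-length input (an assumption)
            approx.append(approx[-1])
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2.0)
                  for i in range(0, len(approx), 2)]
    return approx


# A noisy-looking toy current: a slow ramp plus an alternating disturbance.
current = [0.1 * n + (0.05 if n % 2 else -0.05) for n in range(16)]
coarse = haar_approximation(current, levels=3)
print(len(coarse), coarse)
```

Each level halves the number of coefficients, so a 16-point signal yields 2 coarse coefficients at layer 3, and the alternating disturbance is averaged out of them.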
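Step 2: For the AC motor system with a symmetric structure, the Concordia transform can be used to calculate a 2-dimensional current trajectory ($\hat{i}_\alpha$, $\hat{i}_\beta$) in the α-β frame:

$$\hat{i}_\alpha(k) = \sqrt{\tfrac{2}{3}}\, \hat{i}_a(k) - \tfrac{1}{\sqrt{6}}\, \hat{i}_b(k) - \tfrac{1}{\sqrt{6}}\, \hat{i}_c(k), \qquad \hat{i}_\beta(k) = \tfrac{1}{\sqrt{2}} \left( \hat{i}_b(k) - \hat{i}_c(k) \right)$$

where $\hat{i}_a(k)$, $\hat{i}_b(k)$ and $\hat{i}_c(k)$ are the normalized wavelet coefficients of the three phase currents. Figure 16 illustrates selected trajectories for ($\hat{i}_\alpha$, $\hat{i}_\beta$) under faults. Note that the numbers on the axes have no units because they are ratios from the formulae (13) and (14). We can observe that these trajectories differ from each other mainly in terms of geometry. Hence, we can consider extracting simple centroid features from these trajectories.
The centroid features can be extracted from a closed trajectory [34, 35]:

$$C_\alpha = \frac{1}{M} \sum_{k=1}^{M} \hat{i}_\alpha(k), \qquad C_\beta = \frac{1}{M} \sum_{k=1}^{M} \hat{i}_\beta(k)$$

In order to obtain a completely closed trajectory, M should be the number of data points from at least one cycle of the waveform. A 2-dimensional feature vector $\hat{E} = [C_\alpha, C_\beta]$ can describe the centroid of the trajectory regarding its radius size. The 2-dimensional centroid features are sufficient to discern seven faults and can lead to good classification results.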
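A minimal Python sketch of the trajectory centroid computation is given below. It assumes the power-invariant form of the Concordia transform and takes the centroid as the arithmetic mean of the trajectory points over one cycle; both choices, the function names and the synthetic currents are assumptions made for illustration.

```python
import math


def concordia(ia, ib, ic):
    """Power-invariant Concordia transform of one sample (the scaling is an assumption)."""
    alpha = math.sqrt(2.0 / 3.0) * ia - ib / math.sqrt(6.0) - ic / math.sqrt(6.0)
    beta = (ib - ic) / math.sqrt(2.0)
    return alpha, beta


def trajectory_centroid(ia, ib, ic):
    """Mean of the (alpha, beta) points; the inputs should span at least one full cycle."""
    points = [concordia(a, b, c) for a, b, c in zip(ia, ib, ic)]
    m = len(points)
    return sum(p[0] for p in points) / m, sum(p[1] for p in points) / m


# Synthetic balanced currents over exactly one cycle (M = 50 points).
M = 50
theta = [2.0 * math.pi * k / M for k in range(M)]
ia = [math.sin(t) for t in theta]
ib = [math.sin(t - 2.0 * math.pi / 3.0) for t in theta]
ic = [math.sin(t + 2.0 * math.pi / 3.0) for t in theta]

healthy = trajectory_centroid(ia, ib, ic)
# Toy fault: the positive half-cycle of phase a is missing.
faulty = trajectory_centroid([min(v, 0.0) for v in ia], ib, ic)
print(healthy, faulty)
```

For the balanced currents the trajectory is a circle centred on the origin, whereas the toy fault shifts the centroid along the α axis, which is the kind of geometric difference the centroid features are meant to capture.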
Results for inverter. Fourteen feature samples of each fault pattern were used as the training set, whereas the other 26 samples were used as the validation set. With C fixed at 100, the one-against-rest SVM achieves its best result for σ = 1, whereas the one-against-one SVM achieves its best result for σ = 2.
For the designed iDAG SVM, the validation set is also used to search for a proper value of K. The searching curves are given in Fig. 17. In this experiment, different values of K can lead to the same accuracy (96.7%). Hence, K = 2 was directly selected for this iDAG SVM. In the neural classifier design, the number of hidden neurons is set to 7 and the BPNN is trained several times. The best result of 97.8% is recorded. Good results for these classifiers are shown in Table 5.
… when classifying a small number of faults. However, it is not a good choice to classify a large number of faults with conventional neural networks, because these classifiers will need many more hidden neurons, which makes the training process complex and increases the calculation complexity. Changing a big network into some small-scale networks can solve this problem [32], but both the structure and parameters of these small networks need to be determined deliberately. Moreover, the neural network classifier exhibits different performance for different trainings. The SVM classifiers can be regarded as alternative solutions for the neural classifiers in the applications to power electronic system diagnosis, because an SVM classifier can achieve performance very close to, or even better than, that of a neural classifier. In addition, a standard SVM classifier can exhibit stable performance for different trainings. Moreover, it needs relatively fewer parameters to be tuned.
3) The SVM classifier has many forms [36, 37]. In the research, we examined two typical forms in the domain of power electronic circuit fault diagnosis. The one-against-one SVM is usually used for classification, because it can achieve a high classification accuracy. However, this classifier has a high computational complexity, and this drawback limits its usage in some applications, e.g. in power electronic circuit online fault diagnosis, surveillance or even fault-tolerant systems. Hence, the structure of this classifier needs to be improved or rearranged to adapt to fast fault classification. We adopt the DAG SVM as an alternative for this classifier. In our research, the DAG SVM and the one-against-one SVM were compared in terms of classification accuracy and testing time, and we found that the DAG SVM's performance was very close to its counterpart's, but with a significantly lower computational complexity. Hence, we believe that the DAG SVM can be used to replace the one-against-one SVM for fault classification of PECs. The iDAG SVM, with the help of nearest neighbours, can further reduce the testing time, whilst maintaining almost unchanged performance.
4) The WT is a good tool for fault feature extraction of power electronic circuits. We achieved good results by using this tool in the experiments. The proper selection of a good mother function and decomposition layers is a difficult problem, and in our research we solved this problem by comparing and evaluating the experiment results with different parameters.
Conclusions
In industrial applications, a fast method of fault diagnosis in power electronic circuits is important because of the requirements of high reliability and fault-tolerance. This paper presents a data-driven method of fault classification in power electronic circuits, and this method is based on the DAG SVM. This classifier can be improved by combining it with the K-NN method. Compared with the conventional one-against-rest SVM and one-against-one SVM, the presented method has a very high implementation speed, because this method is based on the DAG SVM, which needs to compute (N−1) BSVMs for N faults. After the improvement, the number of BSVMs needed by the iDAG SVM is less than or equal to (N−2). Hence, among the SVM classifiers, the iDAG SVM has the lowest computational complexity. Also, the iDAG SVM has a classification performance comparable with that of the DAG SVM, provided that the parameter K is properly selected. In our research, K was determined on the basis of experimental searching results. Alternatively, K can be selected with an empirical formula as follows:
With this formula, in many experiments, we found that the iDAG SVM can achieve satisfactory results. Note that the proposed classifier can also be considered as a general method, and this classifier can be easily extended to other fast fault classification applications. The presented method also has some limitations, because it is based on the conventional DAG SVM, which needs to be prearranged. In arranging the DAG SVM structure, the selection of a root node for the DAG SVM is a problem. Different root nodes will probably lead to different performance of the classifier. Hence, the proper selection of a root node should be further studied. In the future, we can consider some available methods in designing pattern classifiers to solve this problem [38−40] .
Finally, the feature extraction is important in the design of a successful classifier. In our research we need to select fault features manually, so in future work automatic and efficient feature selection methods will be examined.
