Abstract-Classification of power-quality (PQ)-related voltage and current waveform disturbances is a key task for power system monitoring. A new method based on the optimized time-frequency representation (TFR) has been proposed in the first paper [1] of this two-paper series. This paper (the second paper) presents a case study of PQ event classification with the proposed method. The classification algorithm has been successfully tested with 860 real world PQ events that cover five classes, achieving a recognition rate of 98%. The algorithm is implemented on a digital signal processor (DSP) based hardware system and optimized according to the DSP architecture to meet the hard real-time constraints. The DSP-based system is capable of processing a five-cycle (83.3 ms) PQ waveform within 11.2 ms. The real-time computing capability of the algorithm has been verified with this result. The scalability of this method is also discussed.
I. INTRODUCTION
THE first paper [1] of this two-paper series presents the theoretical foundation of a new power-quality (PQ) event classification algorithm. This approach is composed of two sequential processes: feature extraction and classification. The feature extraction process projects a PQ waveform onto a low-dimension time-frequency representation (TFR), which is deliberately designed to maximize the separability between classes. To maximize the classification accuracy, a distinct TFR is designed to separate a class or a class group from all other classes. Each separation procedure usually generates a binary decision, except when "other events" are considered. The classifiers include a Heaviside-function linear classifier and neural network classifiers with feedforward structure.
This paper, the second part of this two-paper series, presents an application and the performance evaluation of the proposed algorithm. First, several optimized TFRs for the classification task are designed using training signals from all classes. The design of an optimized TFR is essentially the design of an optimized kernel function for smoothing the time-frequency ambiguity plane. Second, the parameters of the classifiers are trained with the training signals from all classes. Third, testing PQ waveforms are supplied to the system and the classification performance of the proposed system is evaluated. Fourth, a digital signal processor (DSP)-based hardware system is implemented and the real-time monitoring capability of this method is evaluated.
This algorithm has been successfully tested with 860 real world PQ events that cover five classes: voltage sags, harmonics, high-frequency capacitor switching, low-frequency capacitor switching, and normal variations (normal noise). An average recognition rate of 98.0% was achieved. The DSP-based hardware system is capable of processing a five-cycle (83.3 ms) PQ waveform within 11.2 ms. The real-time computing capability of the proposed method has been verified with this result. This paper shows the potential for building a new generation of high-accuracy, high-capability real-time PQ monitors by integrating the proposed method with modern DSP technologies.
II. OPTIMAL TFR DESIGN AND CLASSIFIER TRAINING

A. TFR Design
Using the method presented in the theory part of this paper series [1], a small group of PQ waveform events covering the five PQ classes mentioned above is used here for TFR design and classifier training. This group is composed of 50 signals from each individual class. Therefore, a total of 250 real-world voltage waveforms are utilized for training purposes.
In this algorithm, classifying the signal classes requires kernels corresponding to optimized TFRs. Because five classes are under consideration in this application, four different kernels are designed: the harmonics kernel, the sag kernel, the capacitor switching kernel, and the capacitor high-frequency switching kernel. The capacitor switching kernel separates the class group {capacitor high-frequency switching, capacitor low-frequency switching} from the remaining class, normal variations. The capacitor high-frequency switching kernel separates the class capacitor high-frequency switching from the remaining class, capacitor low-frequency switching. Table 1 shows how each kernel is utilized in the classification process.
(Table I: Harmonics kernel feature ranking. Table II: Voltage sag kernel feature ranking. Table III: Capacitor switching kernel feature ranking.)
The kernel design follows the procedures explained in the first part. Tables I-IV show the detailed process for designing four different kernels. The number of feature points required for each kernel is searched by extensive classification experiments. Two criteria are used for determining the number of feature points for each kernel: first, the computation complexity needs to be as low as possible; second, the generated features need to produce good classification performance. The number of feature points needed for each kernel is searched from low to high, starting from 1. The optimal numbers of feature points for each individual kernel have been found: one for the harmonics kernel; two for the voltage sag kernel; three for the capacitor switching kernel; and three for the capacitor high-frequency switching kernel. Therefore, nine feature locations are selected for these four kernels.
In Tables I-IV, the column "Location" gives the coordinate on the time-frequency ambiguity plane, expressed as a discrete frequency shift and a discrete time lag [1]. To design each kernel, all 640 × 640 locations on the ambiguity plane are ranked by their Fisher's discriminant scores, which are calculated as described by equation (8) in [1]. The correlations between locations on the ambiguity plane are also taken into consideration in the feature selection process. Because the moduli of the points on the ambiguity plane are symmetric about both the horizontal and vertical axes of the plane, a group of four locations usually has the same Fisher's discriminant score. Only one of these four locations is selected as a feature location.
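The ranking step can be illustrated with a short sketch. It assumes the standard two-class Fisher score (squared difference of class means over the sum of class variances), which may differ in detail from equation (8) in [1], and it assumes the plane is stored DFT-style with the origin at index [0, 0], so that keeping one quadrant removes the symmetric duplicates. The function names and array layout are hypothetical, and the correlation check between locations is not included.

```python
import numpy as np


def fisher_scores(class_a, class_b, eps=1e-12):
    """Two-class Fisher score at every ambiguity-plane location.

    class_a, class_b: arrays of shape (n_signals, n_rows, n_cols) holding
    ambiguity-plane moduli for the target class and for the other classes.
    The score form is an assumption, not taken from equation (8) in [1].
    """
    mean_a, mean_b = class_a.mean(axis=0), class_b.mean(axis=0)
    var_a, var_b = class_a.var(axis=0), class_b.var(axis=0)
    # eps only guards against division by zero for constant locations.
    return (mean_a - mean_b) ** 2 / (var_a + var_b + eps)


def rank_locations(scores, top_k):
    """Return the top_k (row, col) locations, highest score first.

    Only one quadrant (including its axes) is searched, so each group of
    mirror-symmetric locations contributes a single candidate.
    """
    n_rows, n_cols = scores.shape
    quadrant = scores[: n_rows // 2 + 1, : n_cols // 2 + 1]
    flat_order = np.argsort(quadrant, axis=None)[::-1][:top_k]
    rows, cols = np.unravel_index(flat_order, quadrant.shape)
    return [(int(r), int(c)) for r, c in zip(rows, cols)]
```

In this sketch, the training moduli of one class (or class group) go in `class_a` and those of all remaining classes go in `class_b`, once per kernel.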
The results of the feature selection are shown in Table V. Among the nine feature points chosen, one is for the harmonics kernel, two for the sag kernel, three for the capacitor switching kernel, and three for the capacitor high-frequency switching kernel. Fig. 1 shows how these points are distributed on the time-frequency ambiguity plane. With this result, the feature extraction process maps a 640-point waveform to a nine-point feature vector. Fig. 2(c) shows the within-class average values of the sub-feature-vector associated with the capacitor switching kernel. A separation can be found between the average features from the capacitor switching class group and the remaining non-capacitor-switching class {normal variation}. Fig. 2(d) shows the within-class average values of the sub-feature-vector associated with the capacitor high-frequency switching kernel. A separation can be found between the average features from the capacitor low-frequency switching class and the capacitor high-frequency switching class.
In Fig. 2(a) through Fig. 2(d), the y-axis feature values vary on different scales. All of the feature vectors from these kernels are linearly normalized to a common range before they are passed to the classifiers.
The classification-optimal TFRs corresponding to the four designed kernels are calculated from equation (4) in [1] and shown in Figs. 3-6. As emphasized in the theory part, because of the unique mapping relationship between a kernel-smoothed ambiguity plane and a TFR, a shortcut has been taken in the algorithm: the kernel-smoothed ambiguity plane is used instead of the TFR. The designed TFRs are still shown here because the underlying philosophy of the proposed method is to design TFRs that are optimized for classification.
There are two TFRs in each figure (Figs. 3-6): the average classification-optimal representation of the current class, and the average of all remaining classes. Each of the designed TFRs is the result of a global smoothing process. With an explicit goal of classification, the accurate time-frequency structure of a signal is globally smoothed and becomes obscured because of cross-terms.
Unlike in a wavelet or Fourier spectrum, the starting time of a transient event cannot be identified from the designed TFRs. However, the distances between classes are emphasized, which yields the high classification performance. In Figs. 3-6, each figure shows two TFRs that have a similar pattern but dramatically different energy content, because they correspond to the same kernel function but different feature values. According to the darkness of the TFR visualization, different classes show different levels of global energy content. Considering that the common 60 Hz components have been removed in the smoothing process, this ordering matches well with our expected energy content of the different classes. However, uniform smoothing based solely on the energy content of the spectrum is not enough and leads to classification mistakes. Our smoothing procedures also take into consideration the time correlations of different frequency-domain peaks through the auto-correlation step. This is the reason why the capacitor switching TFR in Fig. 5 and the voltage sag TFR in Fig. 4 show nonuniform patterns. However, it is difficult to provide further interpretation of these different patterns, because the goal of the TFR design process is good separation between classes rather than accurate representation of a signal. The true time-frequency structure has been smoothed across the whole TFR. The TFRs designed in this paper are quadratic TFRs [1]. Because of cross-terms, visual interpretation is widely recognized as a limitation of quadratic TFRs, although they generally have better time-frequency resolution than linear TFRs (e.g., WT, STFT, etc.).
B. Classifier Training
Four classifiers are used in this algorithm to transform the extracted information in the feature vectors into binary decisions.
The four classifiers used in this paper include a Heaviside linear classifier and three ANN classifiers with three-layer feedforward structures. The neuron numbers for the three ANNs are 2-12-2 (input-hidden-output), 3-10-2, and 3-10-2, respectively. The training and testing methods for the four classifiers adopted in this algorithm are detailed in Section IV-B in [1]. Traditional transfer and training functions are adopted. The inputs to each ANN are the normalized feature values, and the output of the ANN is the binary decision. Two distinct target vectors are used for the two decision patterns in the output layer.
The training process is conducted with 50 waveforms from each class. The threshold value for the linear classifier and the weight/bias matrices for the ANN classifiers are obtained from the training.
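As a rough illustration of the two classifier types, the sketch below trains a Heaviside threshold on a single feature and runs a forward pass through a three-layer feedforward network such as the 2-12-2 structure. The midpoint-of-class-means threshold rule, the tanh hidden layer, and all function names are assumptions made for illustration; the training procedure actually used is the one described in Section IV-B of [1].

```python
import numpy as np


def train_threshold(features, labels):
    """Fit a one-feature Heaviside classifier.

    Assumed rule: the threshold is the midpoint between the two class means,
    and polarity records which side of the threshold class 1 lies on.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    mean_pos = features[labels == 1].mean()
    mean_neg = features[labels == 0].mean()
    threshold = 0.5 * (mean_pos + mean_neg)
    polarity = 1.0 if mean_pos >= mean_neg else -1.0
    return threshold, polarity


def heaviside_classify(feature, threshold, polarity=1.0):
    """Binary decision: 1 if the feature lies on the class-1 side."""
    return int(polarity * (feature - threshold) > 0)


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a three-layer feedforward network.

    For a 2-12-2 network, w_hidden has shape (12, 2) and w_out has shape
    (2, 12). The tanh transfer function is an assumption.
    """
    hidden = np.tanh(w_hidden @ np.asarray(x, dtype=float) + b_hidden)
    return w_out @ hidden + b_out
```

With two output neurons, the binary decision can be read as the index of the larger output, for example `int(np.argmax(mlp_forward(...)))`.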
III. WAVEFORM PROCESSING ALGORITHM
As shown in Fig. 7, after the kernel design and classifier training are completed, the waveform processing algorithm becomes straightforward. Note that in the example application presented in this paper, only five classes are considered. The way this algorithm handles other events, i.e., the dashed-line box in Fig. 7, will be discussed in detail in Section VI, where the scalability of this method is analyzed.
For a 640-point input signal, nine feature points are calculated, using equations (5) and (6) . These nine real values are then sequentially sent to four different linear and ANN classifiers for making the final classification decision. 
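A minimal sketch of this feature extraction step is given below. It assumes the ambiguity plane is the DFT (along the time index) of a circular instantaneous autocorrelation x[n]·x[(n+τ) mod N], which may differ in detail from equations (5) and (6); the function name and the (frequency shift, time lag) tuple layout are hypothetical. Only the autocorrelation columns and DFT coefficients that are actually needed are computed.

```python
import numpy as np


def ambiguity_features(x, locations):
    """Moduli of the ambiguity plane at selected (eta, tau) locations.

    x: real-valued waveform of length N (640 in this application).
    locations: iterable of (eta, tau) pairs, with eta the discrete frequency
    shift and tau the discrete time lag.
    Assumption: circular autocorrelation and a forward DFT over n.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    idx = np.arange(n)
    columns = {}  # one autocorrelation column per distinct time lag
    feats = []
    for eta, tau in locations:
        if tau not in columns:
            # np.roll(x, -tau)[k] equals x[(k + tau) mod N]
            columns[tau] = x * np.roll(x, -tau)
        twiddle = np.exp(-2j * np.pi * eta * idx / n)
        # A single DFT coefficient of the lag-tau column.
        feats.append(abs(np.dot(columns[tau], twiddle)))
    return np.array(feats)
```

Passing nine locations yields the nine-point feature vector; locations that share a time lag reuse the same cached column.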
IV. DSP-BASED REAL-TIME HARDWARE IMPLEMENTATION
To verify the real-time computing capability of the proposed algorithm, a DSP-based hardware system is implemented with the proposed algorithm as the embedded processing engine.
A. System Setup
A global block diagram of the monitoring system is shown in Fig. 8. The algorithm is implemented on a Texas Instruments (TI) TMS320VC5416 digital signal processor (DSP) with the TI THS1206 12-bit, 6-MSPS analog-to-digital converter (ADC). A TI TMS320VC5416 DSP Starter Kit (DSK) is used as the host board, with the THS1206 mounted on a daughter card. In deployment mode, the input signal is passed through a potential transformer and then fed to the ADC. In testing mode, the input signal is fed either from a function generator or from a PC station via a USB port. The response time of the system is measured with the input waveform from a function generator, while the correctness of the algorithm implementation is verified with the input from a PC station via a USB port.
The TMS320VC5416 is a fixed-point DSP with 128 kB of on-chip memory and a 160-MHz clock speed, which can perform 160 MIPS. This processor has a 17 × 17-bit parallel multiply-accumulate unit, which allows single-cycle multiply-accumulate operations. While floating point multipliers 
B. Optimization for Real-Time Computing
Significant optimization efforts have been taken when programming the DSP in order to reduce the algorithm computation time.
1) Reduction of the Quantities to be Calculated:
The results of kernel and classifier training show that only nine kernel points from seven columns of the ambiguity plane are needed for implementing the classification process. According to the algorithm, it is enough to calculate only the seven kernel-related columns of the autocorrelation matrix and the nine kernel points of the ambiguity plane matrix. Calculating the entire autocorrelation matrix and the entire ambiguity plane matrix for the full processing window would require far more multiplications and additions. After reducing the number of quantities to be calculated as stated above, the worst-case computation cost of both the autocorrelation step and the ambiguity plane step is reduced substantially.
... is then normalized for storage into a floating-point value. For each DFT operation, this normalization and addition step would occur once for the real part and once for the imaginary part. This all-integer optimization cuts the algorithm execution time in half, although a small precision loss is introduced.
3) Use Hard-Coded sin Table and cos Table: The discrete Fourier transform (DFT) is implemented with cos and sin functions instead of the exponential function, according to Euler's formula. Because the standard C math library focuses on accuracy, its sin and cos functions are quite costly in processor time. Because the on-chip memory had not been completely consumed by other operations of the algorithm, a lookup table is used for these functions. The values are stored as signed integers ranging from -32767 to 32767, which is adequate considering that a 12-bit ADC is used.
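The lookup-table idea can be sketched as follows. This is a Python model of the arithmetic rather than the DSP code: the table length of 640 (one entry per sample of the processing window), the table names, and the function name are assumptions. Python integers do not overflow, so the accumulator width that a fixed-point DSP implementation must respect is not modeled here.

```python
import math

N = 640          # assumed table length: one entry per sample of the window
SCALE = 32767    # table values span -32767..32767, as stated in the text

# Precomputed once, so no sin/cos calls are needed at run time.
COS_TABLE = [round(SCALE * math.cos(2 * math.pi * k / N)) for k in range(N)]
SIN_TABLE = [round(SCALE * math.sin(2 * math.pi * k / N)) for k in range(N)]


def dft_bin_int(samples, eta):
    """One DFT coefficient using integer multiply-accumulate operations only.

    samples: sequence of N integers (for example 12-bit ADC products).
    Returns (re, im), both scaled up by SCALE relative to the true DFT.
    """
    re = 0
    im = 0
    for n, s in enumerate(samples):
        k = (eta * n) % N          # table index for the angle 2*pi*eta*n/N
        re += s * COS_TABLE[k]
        im -= s * SIN_TABLE[k]     # minus sign from exp(-j*theta)
    return re, im
```

Dividing the returned values by `SCALE` recovers the usual DFT scaling, at the cost of the small rounding error introduced by the integer tables.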
V. EXPERIMENTAL RESULTS
This section provides results for both classification performance of the algorithm and real-time monitoring capability of the DSP-based hardware system.
A. Waveform Classification Performance
The proposed classification algorithm has been successfully tested with real world PQ data. The signal window used is five cycles, and the sampling rate is 128 points per cycle. The number of required training signals is determined by multiple training and testing experiments. In this application, 50 training PQ events per class are used for both the kernel design and classifier training. It is a reasonably low requirement on training data. The 250-waveform training data set is much smaller than the 524-waveform one in [2] and comparable to the 160-waveform one (only for classifying four types of events) in [3] . Generally, to successfully incorporate this classification method into a new power system, about 50 signals from each class need to be acquired in advance.
The 860 testing events, including the training events, are 200 harmonic events, 144 voltage sag events, 138 capacitor high-frequency switching events, 180 capacitor low-frequency switching events, and 194 normal variations. This data set is comparable in size to those used in other recent papers in this area (675 testing waveforms in [2] and 507 testing waveforms in [3]).
This method shows better results than previous work [4], which uses a single kernel, instead of multiple kernels, for a six-class PQ event classification. The better performance of the multiple-kernel approach is not consistent with [5]. Therefore, both single-kernel and multiple-kernel techniques have advantages and disadvantages. The multiple-kernel method requires more effort in kernel design than the single-kernel method, but leaves less work to the classifiers. The performance of these two methods is application dependent.
Detailed step-by-step results are listed in Table VI, and the final results for the testing experiment are listed in Table VII. An average recognition rate of 98.0% is achieved on the tested 860 PQ events.
The overall recognition rate on real world PQ data is higher than the results reported in [2] (waveform number: 675, recognition rate: 94.8%), [3] (waveform number: 507, recognition rate: 95.6%), and [6] (recognition rate: 89%). There are no post-processing procedures in the algorithm, which reduces the computation time and makes real-time monitoring possible. However, it is hard to judge the performance of different classification algorithms without testing these algorithms with the same set of testing data. There is a great need of creating a standard testing data set for PQ events classification, so that different algorithms can be appropriately evaluated.
B. Real-Time Monitoring Capability
The classification of an 83.33-ms window takes 11.2 ms, which satisfies the real-time constraints of most PQ monitoring tasks. This time is measured while the ADC is running on the same board and interrupting 960 times per second. Within the 11.2 ms, 1.8 ms are used for the autocorrelation step, 8.7 ms for the DFT step, and 0.70 ms for the classifier step, as shown in Fig. 9. Only 6% of the processing time is used for the classifier step, which shows that the classifiers used in this paper do not become a computation bottleneck. The total response time can be reduced further by running the ADC card on another board.
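The timing budget can be checked with a few lines of arithmetic. The step times are the measured values quoted above; the 60-Hz fundamental is inferred from the five-cycle, 83.33-ms window, and the variable names are illustrative only.

```python
# Five cycles of a 60-Hz waveform (the 83.33-ms window in the text).
WINDOW_MS = 5 * 1000.0 / 60.0

# Measured step times in milliseconds, as reported for the DSP system.
steps_ms = {"autocorrelation": 1.8, "dft": 8.7, "classifier": 0.7}

total_ms = sum(steps_ms.values())
shares = {name: t / total_ms for name, t in steps_ms.items()}
utilization = total_ms / WINDOW_MS  # fraction of the window spent computing

for name, share in shares.items():
    print(f"{name}: {100 * share:.1f}% of processing time")
print(f"total: {total_ms:.1f} ms, {100 * utilization:.1f}% of the window")
```

The classifier share works out to 0.7/11.2, about 6%, and the processing occupies roughly 13% of the window length.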
With the same DSP system, the response times of this algorithm under different signal window sizes are estimated. Because only a constant number of columns are calculated in the autocorrelation step, the autocorrelation step takes O(N) time (N is the number of sampling points). Similarly, the DFT step also takes O(N) time, because only a constant number of DFT coefficients are needed. The same classifier step, which takes a constant time of 0.7 ms, is assumed. Table VIII shows that an estimated time of 0.11 cycles is needed for processing a half-cycle waveform, 0.17 cycles for a one-cycle waveform, 0.29 cycles for a two-cycle waveform, 0.42 cycles for a three-cycle waveform, 0.55 cycles for a four-cycle waveform, and 1.30 cycles for a ten-cycle waveform. This rough estimation suggests various potential real-time applications of this method.
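One way such an estimate could be produced is sketched below. It assumes that the autocorrelation and DFT steps scale linearly with the window length from the measured five-cycle times (1.8 ms and 8.7 ms), that the classifier step stays at 0.7 ms, and that the fundamental is 60 Hz. The function name is hypothetical; under these assumptions the results agree with the values quoted from Table VIII to within rounding.

```python
CYCLE_MS = 1000.0 / 60.0           # one cycle of a 60-Hz waveform
MEASURED_CYCLES = 5                # window length of the measured case
SCALABLE_MS = 1.8 + 8.7            # autocorrelation + DFT time at 5 cycles
CLASSIFIER_MS = 0.7                # assumed constant for any window length


def estimated_time_in_cycles(window_cycles):
    """Estimated processing time, expressed in power-frequency cycles."""
    per_cycle_ms = SCALABLE_MS / MEASURED_CYCLES
    total_ms = per_cycle_ms * window_cycles + CLASSIFIER_MS
    return total_ms / CYCLE_MS


for cycles in (0.5, 1, 2, 3, 4, 10):
    print(f"{cycles}-cycle window: {estimated_time_in_cycles(cycles):.3f} cycles")
```

Under this model, the processing time stays below the window length for every window size listed, and the margin grows with the window because the constant classifier time is amortized.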
VI. SCALABILITY OF THIS METHOD
In reality, the PQ related events are more diverse than the five types covered in this application. This application should be considered more as a proof-of-concept case study than a ready-to-deploy system. Therefore, it is important to analyze the extensibility of the proposed algorithm.
In general, this algorithm can be conveniently modified to address larger-scale PQ event classification problems. Depending on the specific application, different extended classification capabilities may be desired. Two scenarios are considered in the following discussion. In the first case (shown as "case 1" in Fig. 7), assume that the goal is to classify all other events that do not belong to the five types in this paper as "unrecognized other events". To realize this, a third output can be added to the classifier at the lowest branch of the classification hierarchy, by slightly changing the ANN structure. In the second case (shown as "case 2" in Fig. 7), assume that the goal is to classify more than five types of PQ events. Only one additional kernel needs to be designed and added for every additional class. Given a sufficient number of training signals, this can be easily implemented. The application in this paper also shows that each additional kernel usually requires selecting only a very small constant number (1, 2, or 3 in this paper) of additional feature locations from the time-frequency ambiguity plane. The computation complexity grows linearly with the number of event classes, and the real-time computing capability is retained.
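The extensibility argument can be pictured as a cascade of (kernel features, binary classifier) stages, where supporting one more class amounts to adding one more stage. The sketch below is illustrative only: the stage order, the feature-vector slices (1, 2, 3, and 3 points), and the stand-in threshold rules are assumptions, and in the actual system each decision is made by the trained linear or ANN classifier.

```python
def make_cascade(stages, fallback):
    """Build a classifier from an ordered list of binary stages.

    stages: list of (label, feature_slice, is_member) tuples, where is_member
    is any callable returning True when the sub-feature-vector belongs to the
    stage's class. fallback is the label returned when no stage fires.
    """
    def classify(features):
        for label, feature_slice, is_member in stages:
            if is_member(features[feature_slice]):
                return label
        return fallback
    return classify


# Hypothetical stand-ins for the four trained classifiers.
stages = [
    ("harmonics", slice(0, 1), lambda f: f[0] > 0.5),
    ("voltage sag", slice(1, 3), lambda f: sum(f) > 1.0),
    # Negated capacitor-switching decision: not switching means normal.
    ("normal variation", slice(3, 6), lambda f: sum(f) < 0.3),
    ("capacitor high-frequency switching", slice(6, 9), lambda f: sum(f) > 1.5),
]
classify = make_cascade(stages, "capacitor low-frequency switching")
```

In this arrangement, adding a class means appending its feature locations to the feature vector and inserting one more tuple into `stages`, so the per-waveform cost grows by a small constant for each new class.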
VII. CONCLUSIONS AND FUTURE WORK
Identification and classification of voltage and current disturbances in power systems is an important task in power system monitoring and protection. A new classification algorithm for PQ disturbances has been proposed and successfully tested in this paper. This algorithm implements a feature extraction scheme that is novel in power engineering applications. By designing classification-optimal TFRs, features are directly selected from the time-frequency ambiguity plane based on Fisher's principle. Four linear and neural network classifiers with simple structures are used for the final decision-making. This method also shows low computation complexity in its implementation. In a five-class experiment, the presented method correctly classified 98% of 860 real world PQ events. This novel combination of methods shows promise for the future development of fully automated monitoring systems with classification ability. Power system monitoring becomes more powerful with the ability to classify PQ disturbance signals automatically. The algorithm is implemented on a DSP-based hardware system and optimized according to the architecture of the DSP to meet the hard real-time constraints. The real-time computing capability of the algorithm is verified. The DSP-based system is capable of processing a five-cycle (83.3 ms) PQ waveform within 11.2 ms.
The crisp classification strategy presented in this paper is limited in cases where multiple types of PQ disturbances happen simultaneously in a power system. Based on the same feature extraction scheme, a fuzzy classification approach, which can provide more accurate and comprehensive information, is under study.
Besides the phenomenon-based classification, research efforts have been devoted to the development of a root-cause-based classification system. The system will not only accurately discriminate the type of an event (e.g., voltage sag), but also diagnose possible causes of an event (e.g., motor starting or power system faults).
