Selection of Input Parameters for Multivariate Classifiers in Proactive Machine Health Monitoring by Clustering Envelope Spectrum Harmonics
In condition monitoring (CM) signal analysis, the inherent problem of key characteristics being masked by noise can be addressed by analysing the signal envelope. Envelope analysis of vibration signals is effective in extracting useful information for diagnosing different faults. However, the number of envelope features is generally too large to be incorporated effectively in system models. In this paper, a novel method of extracting the pertinent information from such signals, based on multivariate statistical techniques, is developed; it substantially reduces the number of input parameters required for data classification models. This was achieved by clustering possible model variables into a number of homogeneous groups to ascertain levels of interdependency. Representatives from each of the groups were selected for their power to discriminate between the categorical classes. The techniques established were applied to a reciprocating compressor rig, where the target was to identify machine states with respect to operational health by comparing signal outputs for healthy and faulty systems. The technique allowed near-perfect fault classification. In addition, methods for identifying separable classes are investigated through profiling techniques, illustrated using Andrews' Fourier curves.
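The abstract does not say which clustering method or discrimination measure the authors used, so the following is only a minimal sketch of the general idea: group strongly inter-correlated variables, then keep the one variable per group that best separates the classes. The greedy correlation grouping, the Fisher-style score, the 0.9 threshold, and all function names are assumptions made for illustration.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0

def cluster_variables(columns, threshold=0.9):
    """Greedy grouping (an assumed stand-in for the paper's clustering):
    a variable joins the first group whose first member it correlates
    with at |r| >= threshold, otherwise it starts a new group."""
    groups = []
    for j, col in enumerate(columns):
        for g in groups:
            if abs(pearson(columns[g[0]], col)) >= threshold:
                g.append(j)
                break
        else:
            groups.append([j])
    return groups

def fisher_score(col, labels):
    """Between-class scatter divided by within-class scatter for one variable."""
    mean = sum(col) / len(col)
    between = within = 0.0
    for c in set(labels):
        vals = [x for x, l in zip(col, labels) if l == c]
        m = sum(vals) / len(vals)
        between += len(vals) * (m - mean) ** 2
        within += sum((x - m) ** 2 for x in vals)
    return between / within if within > 0 else float("inf")

def select_representatives(columns, labels, threshold=0.9):
    """Return one variable index per group: the most discriminative member."""
    groups = cluster_variables(columns, threshold)
    return [max(g, key=lambda j: fisher_score(columns[j], labels)) for g in groups]
```

Here `columns` is a list of variables (for example, envelope spectrum harmonic amplitudes), each holding one value per observation, and `labels` holds the machine state of each observation. The returned indices are the reduced input set that would be passed to a classifier.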
A Clustering-Based Algorithm for Data Reduction
Finding an efficient data reduction method for large-scale problems is an imperative task. In this paper, we propose a similarity-based self-constructing fuzzy clustering algorithm to sample instances for the classification task. Instances that are similar to each other are grouped into the same cluster. When all the instances have been fed in, a number of clusters have been formed automatically. The statistical mean of each cluster is then taken to represent all the instances covered by that cluster. This approach has two advantages. One is that it can be faster and use less storage. The other is that the number of new representative instances need not be specified in advance by the user. Experiments on real-world datasets show that our method runs faster and obtains a better reduction rate than other methods.
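The incremental procedure described above can be sketched roughly as follows. This is a simplified illustration, not the authors' algorithm: the Gaussian similarity with a single fixed width `sigma`, the threshold `rho`, and the function name are assumptions, and the fuzzy membership details of the paper are omitted.

```python
import math

def self_constructing_clusters(instances, rho=0.5, sigma=1.0):
    """One pass over the data. Each instance joins its most similar
    cluster if the similarity reaches rho, otherwise it founds a new
    cluster. Returns a list of (mean, size) pairs; the means are the
    reduced set of representative instances."""
    means, sizes = [], []
    for x in instances:
        best, best_sim = -1, 0.0
        for c, m in enumerate(means):
            # Product of per-dimension Gaussian memberships (assumed form).
            sim = math.exp(-sum(((a - b) / sigma) ** 2 for a, b in zip(x, m)))
            if sim > best_sim:
                best, best_sim = c, sim
        if best >= 0 and best_sim >= rho:
            sizes[best] += 1
            n = sizes[best]
            # Incremental update of the cluster mean.
            means[best] = [b + (a - b) / n for a, b in zip(x, means[best])]
        else:
            means.append(list(x))
            sizes.append(1)
    return list(zip(means, sizes))
```

The number of clusters is not passed in; it falls out of the threshold `rho`, which matches the abstract's point that the user need not fix the number of representatives in advance.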
A Divide-and-Conquer Solver for Kernel Support Vector Machines
The kernel support vector machine (SVM) is one of the most widely used
classification methods; however, the amount of computation required becomes the
bottleneck when facing millions of samples. In this paper, we propose and
analyze a novel divide-and-conquer solver for kernel SVMs (DC-SVM). In the
division step, we partition the kernel SVM problem into smaller subproblems by
clustering the data, so that each subproblem can be solved independently and
efficiently. We show theoretically that the support vectors identified by the
subproblem solution are likely to be support vectors of the entire kernel SVM
problem, provided that the problem is partitioned appropriately by kernel
clustering. In the conquer step, the local solutions from the subproblems are
used to initialize a global coordinate descent solver, which converges quickly
as suggested by our analysis. By extending this idea, we develop a multilevel
Divide-and-Conquer SVM algorithm with adaptive clustering and early prediction
strategy, which outperforms state-of-the-art methods in terms of training
speed, testing accuracy, and memory usage. As an example, on the covtype
dataset with half-a-million samples, DC-SVM is 7 times faster than LIBSVM in
obtaining the exact SVM solution (to within relative error) which
achieves 96.15% prediction accuracy. Moreover, with our proposed early
prediction strategy, DC-SVM achieves about 96% accuracy in only 12 minutes,
which is more than 100 times faster than LIBSVM.
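To make the divide and conquer steps concrete, here is a toy single-level sketch. It is not the DC-SVM implementation: it uses plain k-means in input space where the abstract describes kernel clustering, it solves the bias-free SVM dual with a simple coordinate descent, and it omits the multilevel scheme, adaptive clustering, and early prediction. All function names and parameter defaults are assumptions.

```python
import math
import random

def rbf(a, b, gamma):
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def kmeans(X, k, iters=10, seed=0):
    """Plain k-means in input space; returns a cluster index per sample."""
    centers = random.Random(seed).sample(X, k)
    assign = [0] * len(X)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: sum((p - q) ** 2
                      for p, q in zip(x, centers[c]))) for x in X]
        for c in range(k):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def dual_cd(X, y, idx, alpha, C, gamma, sweeps=50):
    """Coordinate descent on the bias-free SVM dual restricted to the
    samples in idx; alpha is updated in place."""
    Q = {i: {j: y[i] * y[j] * rbf(X[i], X[j], gamma) for j in idx} for i in idx}
    for _ in range(sweeps):
        for i in idx:
            grad = sum(Q[i][j] * alpha[j] for j in idx) - 1.0
            alpha[i] = min(C, max(0.0, alpha[i] - grad / Q[i][i]))

def dual_objective(X, y, alpha, gamma):
    n = len(X)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * rbf(X[i], X[j], gamma)
               for i in range(n) for j in range(n))
    return 0.5 * quad - sum(alpha)

def decision(X, y, alpha, x, gamma):
    return sum(a * yi * rbf(xi, x, gamma) for a, yi, xi in zip(alpha, y, X))

def dc_svm(X, y, k=2, C=1.0, gamma=0.5):
    """Divide: solve one subproblem per cluster. Conquer: use the
    concatenated local solutions to warm-start the global solver."""
    alpha = [0.0] * len(X)
    assign = kmeans(X, k)
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if idx:
            dual_cd(X, y, idx, alpha, C, gamma)
    warm = list(alpha)  # local solutions before the global pass
    dual_cd(X, y, list(range(len(X))), alpha, C, gamma)
    return alpha, warm
```

Because each coordinate update exactly minimizes a convex quadratic over a box, the global pass can only lower the dual objective reached by the warm start. On well-clustered data the warm start is already close to the global solution, which is the intuition behind the fast convergence the abstract reports.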