49,149 research outputs found

    "Selection of Input Parameters for Multivariate Classifiersin Proactive Machine Health Monitoring by Clustering Envelope Spectrum Harmonics"

    In condition monitoring (CM) signal analysis, the inherent problem of key characteristics being masked by noise can be addressed by analysis of the signal envelope. Envelope analysis of vibration signals is effective in extracting useful information for diagnosing different faults. However, the number of envelope features is generally too large to be incorporated effectively in system models. In this paper, a novel method of extracting the pertinent information from such signals, based on multivariate statistical techniques, is developed which substantially reduces the number of input parameters required for data classification models. This was achieved by clustering candidate model variables into a number of homogeneous groups to ascertain levels of interdependency. Representatives from each of the groups were selected for their power to discriminate between the categorical classes. The techniques established were applied to a reciprocating compressor rig, where the target was identifying machine states with respect to operational health through comparison of signal outputs for healthy and faulty systems. The technique allowed near-perfect fault classification. In addition, methods for identifying separable classes are investigated through profiling techniques, illustrated using Andrews' Fourier curves.
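
    The pipeline described above (envelope extraction, grouping of interdependent spectral features, then keeping one discriminative representative per group) can be illustrated with a short Python sketch. This is not the authors' code: the Hilbert-transform envelope, the correlation-based distance, the hierarchical clustering, and the F-score ranking are illustrative stand-ins for the multivariate techniques in the paper, and all function and parameter names are hypothetical.

        # Minimal sketch, assuming NumPy/SciPy/scikit-learn; not the paper's implementation.
        import numpy as np
        from scipy.signal import hilbert
        from scipy.spatial.distance import squareform
        from scipy.cluster.hierarchy import linkage, fcluster
        from sklearn.feature_selection import f_classif

        def envelope_spectrum(signal, n_harmonics=50):
            # Amplitude envelope via the Hilbert transform, then its spectrum magnitudes.
            envelope = np.abs(hilbert(signal))
            spectrum = np.abs(np.fft.rfft(envelope - envelope.mean()))
            return spectrum[1:n_harmonics + 1]

        def select_representatives(X, y, n_groups=10):
            # Cluster interdependent features (high |correlation| -> small distance),
            # then keep the most class-discriminative feature (F-score) in each group.
            dist = 1.0 - np.abs(np.corrcoef(X, rowvar=False))
            np.fill_diagonal(dist, 0.0)
            merge_tree = linkage(squareform(dist, checks=False), method="average")
            groups = fcluster(merge_tree, t=n_groups, criterion="maxclust")
            scores, _ = f_classif(X, y)
            return [int(np.where(groups == g)[0][np.argmax(scores[groups == g])])
                    for g in np.unique(groups)]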

    A Clustering-Based Algorithm for Data Reduction

    Finding an efficient data reduction method for large-scale problems is an imperative task. In this paper, we propose a similarity-based self-constructing fuzzy clustering algorithm to sample instances for the classification task. Instances that are similar to each other are grouped into the same cluster. When all the instances have been fed in, a number of clusters have formed automatically. Then the statistical mean of each cluster is taken to represent all the instances covered by that cluster. This approach has two advantages. One is that it is faster and uses less storage. The other is that the number of new representative instances need not be specified in advance by the user. Experiments on real-world datasets show that our method runs faster and obtains a better reduction rate than other methods.
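
    The incremental, self-constructing step can be sketched as follows. This is a minimal one-pass sketch, not the paper's algorithm: a Gaussian similarity to a running cluster mean and a fixed threshold stand in for the paper's fuzzy membership functions, and for classification it would be applied separately to each class so that every cluster mean inherits a label.

        # Minimal one-pass sketch, assuming NumPy; the threshold and similarity are illustrative.
        import numpy as np

        def reduce_by_clustering(X, threshold=0.9):
            # Each incoming instance joins the most similar existing cluster if its
            # similarity to that cluster's running mean exceeds the threshold;
            # otherwise it starts a new cluster. Cluster means form the reduced set.
            centers, counts = [], []
            for x in X:
                if centers:
                    d = np.linalg.norm(np.asarray(centers) - x, axis=1)
                    sim = np.exp(-d ** 2)                  # similarity in (0, 1]
                    j = int(np.argmax(sim))
                    if sim[j] >= threshold:
                        counts[j] += 1
                        centers[j] += (x - centers[j]) / counts[j]   # update running mean
                        continue
                centers.append(np.asarray(x, dtype=float).copy())
                counts.append(1)
            return np.asarray(centers)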

    A Divide-and-Conquer Solver for Kernel Support Vector Machines

    The kernel support vector machine (SVM) is one of the most widely used classification methods; however, the amount of computation required becomes the bottleneck when facing millions of samples. In this paper, we propose and analyze a novel divide-and-conquer solver for kernel SVMs (DC-SVM). In the division step, we partition the kernel SVM problem into smaller subproblems by clustering the data, so that each subproblem can be solved independently and efficiently. We show theoretically that the support vectors identified by the subproblem solutions are likely to be support vectors of the entire kernel SVM problem, provided that the problem is partitioned appropriately by kernel clustering. In the conquer step, the local solutions from the subproblems are used to initialize a global coordinate descent solver, which converges quickly as suggested by our analysis. By extending this idea, we develop a multilevel Divide-and-Conquer SVM algorithm with adaptive clustering and an early prediction strategy, which outperforms state-of-the-art methods in terms of training speed, testing accuracy, and memory usage. As an example, on the covtype dataset with half a million samples, DC-SVM is 7 times faster than LIBSVM in obtaining the exact SVM solution (to within 10^{-6} relative error), which achieves 96.15% prediction accuracy. Moreover, with our proposed early prediction strategy, DC-SVM achieves about 96% accuracy in only 12 minutes, which is more than 100 times faster than LIBSVM.
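
    A highly simplified, single-level rendering of the divide-and-conquer idea is sketched below. It is not DC-SVM itself: the paper partitions with kernel clustering and warm-starts a global coordinate descent solver from the concatenated local dual solutions, whereas this sketch uses ordinary k-means and, lacking a warm-startable solver in scikit-learn, simply refits a final SVM on the union of the local support vectors.

        # Simplified divide-and-conquer sketch, assuming scikit-learn; not the DC-SVM solver.
        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.svm import SVC

        def dc_svm_sketch(X, y, n_parts=4, C=1.0, gamma="scale"):
            # Divide: partition the data by clustering and solve a small kernel SVM
            # on each part independently.
            parts = KMeans(n_clusters=n_parts, n_init=10, random_state=0).fit_predict(X)
            sv_idx = []
            for p in range(n_parts):
                idx = np.where(parts == p)[0]
                if len(np.unique(y[idx])) < 2:
                    continue                        # a single-class part yields no useful SVs
                local = SVC(C=C, kernel="rbf", gamma=gamma).fit(X[idx], y[idx])
                sv_idx.extend(idx[local.support_])  # map local support vectors to global indices
            # Conquer (simplified): refit on the union of local support vectors, which
            # the paper argues are likely support vectors of the full problem.
            sv_idx = np.unique(sv_idx)
            return SVC(C=C, kernel="rbf", gamma=gamma).fit(X[sv_idx], y[sv_idx])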