192 research outputs found

    Practical selection of SVM parameters and noise estimation for SVM regression”, Neural

    Abstract: We investigate practical selection of hyper-parameters for support vector machine (SVM) regression (that is, the ε-insensitive zone and the regularization parameter C). The proposed methodology advocates analytic parameter selection directly from the training data, rather than the re-sampling approaches commonly used in SVM applications. In particular, we describe a new analytical prescription for setting the value of the insensitive zone ε as a function of training sample size. Good generalization performance of the proposed parameter selection is demonstrated empirically using several low- and high-dimensional regression problems. Further, we point out the importance of Vapnik's ε-insensitive loss for regression problems with finite samples. To this end, we compare the generalization performance of SVM regression (using the proposed selection of ε-values) with regression using 'least-modulus' loss (ε = 0) and standard squared loss. These comparisons indicate superior generalization performance of SVM regression under sparse sample settings, for various types of additive noise.
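
    The three loss functions compared in this abstract can be sketched as follows. This is a minimal illustration of the standard definitions (ε-insensitive loss, least-modulus loss as its ε = 0 special case, and squared loss); the function names are hypothetical, and the sketch does not reproduce the paper's analytic rule for choosing ε or C.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Epsilon-insensitive loss: zero inside the insensitive zone, linear outside it."""
    if eps < 0:
        raise ValueError("eps must be non-negative")
    return max(0.0, abs(y_true - y_pred) - eps)


def least_modulus_loss(y_true, y_pred):
    """Least-modulus (absolute) loss, i.e. the epsilon-insensitive loss with eps = 0."""
    return eps_insensitive_loss(y_true, y_pred, 0.0)


def squared_loss(y_true, y_pred):
    """Standard squared-error loss."""
    return (y_true - y_pred) ** 2


if __name__ == "__main__":
    # Compare the three losses on a few residuals (illustrative values only).
    for y_true, y_pred in [(1.0, 1.05), (2.0, 0.5), (0.0, -3.0)]:
        print(
            y_true,
            y_pred,
            eps_insensitive_loss(y_true, y_pred, eps=0.5),
            least_modulus_loss(y_true, y_pred),
            squared_loss(y_true, y_pred),
        )
```

    Small residuals that fall inside the insensitive zone contribute nothing to the ε-insensitive loss, whereas the squared loss penalizes large residuals much more heavily than either of the other two.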

    Use of data mining tools for cut soil slope condition state identification

    Introduction: Transportation systems play a fundamental role in today's society. Indeed, every developed or developing country has invested, and keeps investing, in building a complete, safe and functional transportation network. Now the main concern, particularly for developed countries, is to keep the network operational under all security conditions. However, due to the extent of the network and increasing budget constraints, this task is difficult to accomplish. Within transportation networks, particularly highways and railways, slopes are perhaps the element whose failure can have the strongest impact at several levels. Although there are some models and systems to detect slope failures, most of them were developed for natural slopes and present some constraints when applied to man-made slopes. Moreover, most existing systems were developed from particular case studies or require information gathered from complex or expensive tests, which can be an important limitation on their applicability. Aiming to overcome this drawback, we take advantage of the learning capabilities of flexible data mining (DM) algorithms, such as Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs), which can model complex nonlinear mappings. Both algorithms were fitted to predict the condition state of a given slope according to a pre-defined classification scale with four levels (classes). One of the premises of this work is to identify the real condition state of a given slope using information collected during routine inspections, complemented with geometric, geological and geographic data.

    Cylindrical roller bearing fault diagnosis based on VMD-SVD and Adaboost classifier method

    Fault diagnosis for cylindrical roller bearings is of great significance for industry. In order to adequately extract the features of the vibration signal and to construct an effective classifier for complex vibration signals, this paper proposes a new fault diagnosis method based on Variational Mode Decomposition (VMD), Singular Value Decomposition (SVD) and an Adaboost classifier. First, VMD is applied to decompose the sampled vibration signal in the time-frequency domain. Subsequently, features are extracted using SVD. Finally, an Adaboost classifier trained on the extracted features is used for fault detection and diagnosis. Experimental data measured on a rotating machinery fault diagnosis experiment platform were used to verify the proposed method. The results demonstrate that the proposed method is effective in detecting and diagnosing outer ring faults and rolling element faults in cylindrical roller bearings.

    Feature Extraction for Murmur Detection Based on Support Vector Regression of Time-Frequency Representations

    This paper presents a nonlinear approach to the analysis of time-frequency representations (TFRs), based on a statistical learning methodology, support vector regression (SVR), which, being a nonlinear framework, matches recent findings on the underlying dynamics of cardiac mechanical activity and phonocardiographic (PCG) recordings. The proposed methodology aims to model the estimated TFRs and extract relevant features for classifying PCG recordings as normal or pathologic (with murmur). TFRs are modeled by means of SVR, and the distance between regressions is calculated through dissimilarity measures based on the dot product. Finally, a k-NN classifier is used for the classification stage, obtaining a validation performance of 97.85%.
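
    The final classification stage can be illustrated with a small sketch. It assumes each recording has already been reduced to a numeric vector (a stand-in for the fitted regression) and uses cosine dissimilarity as one possible dot-product-based measure; the abstract does not say which measure the authors used, and all names and data below are hypothetical.

```python
import math
from collections import Counter


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_dissimilarity(a, b):
    """1 - cosine similarity: an assumed example of a dot-product-based dissimilarity."""
    norm = math.sqrt(dot(a, a) * dot(b, b))
    if norm == 0.0:
        raise ValueError("cosine dissimilarity is undefined for a zero vector")
    return 1.0 - dot(a, b) / norm


def knn_predict(train, query, k=3):
    """Majority vote among the k training items least dissimilar to the query.

    train is a list of (vector, label) pairs.
    """
    ranked = sorted(train, key=lambda item: cosine_dissimilarity(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy vectors standing in for per-recording model representations.
    train = [
        ([1.0, 0.0], "normal"),
        ([0.9, 0.1], "normal"),
        ([0.0, 1.0], "murmur"),
        ([0.1, 0.9], "murmur"),
        ([0.2, 0.8], "murmur"),
    ]
    print(knn_predict(train, [1.0, 0.05], k=3))
```

    Using an odd k with two classes avoids tied votes in this toy setting.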

    Support Vector Machine Approach for Non-Technical Losses Identification in Power Distribution Systems

    Electricity consumer fraud is a problem faced by all power utilities. Finding efficient measurements for detecting fraudulent electricity consumption has been an active research area in recent years. In this paper, an approach to non-technical loss (NTL) detection in power utilities using an artificial-intelligence-based technique, the Support Vector Machine (SVM), is presented. This approach provides a method of data mining that involves feature extraction from past consumption data. The SVM-based approach uses customer load profile information and additional attributes to expose abnormal behavior that is known to be highly correlated with NTL activities. Some key advantages of SVMs in data clustering are discussed, among them the ease with which they can be fitted to data with a wide range of features. Finally, some major weaknesses of using SVMs in clustering for NTL identification are identified, which motivates the use of Optimum-Path Forest, a new model for NTL identification.

    Uplift Modeling with Multiple Treatments and General Response Types

    Randomized experiments have been used to assist decision-making in many areas. They help people select the optimal treatment for the test population with a certain statistical guarantee. However, subjects can show significant heterogeneity in response to treatments. The problem of customizing treatment assignment based on subject characteristics is known as uplift modeling, differential response analysis, or personalized treatment learning in the literature. A key feature of uplift modeling is that the data is unlabeled. It is impossible to know whether the chosen treatment is optimal for an individual subject because the response under alternative treatments is unobserved. This presents a challenge to both the training and the evaluation of uplift models. In this paper we describe how to obtain an unbiased estimate of the key performance metric of an uplift model, the expected response. We present a new uplift algorithm which creates a forest of randomized trees. The trees are built with a splitting criterion designed to directly optimize their uplift performance based on the proposed evaluation method. Both the evaluation method and the algorithm apply to an arbitrary number of treatments and general response types. Experimental results on synthetic data and industry-provided data show that our algorithm leads to significant performance improvement over other applicable methods.
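
    One common way to estimate the expected response of a treatment-assignment rule from randomized data is inverse-probability weighting: keep only the subjects whose randomly assigned treatment matches the rule's recommendation and reweight them by the assignment probability. The sketch below assumes known, fixed assignment probabilities; it is not necessarily the exact estimator proposed in the paper, and every name in it is hypothetical.

```python
def estimate_expected_response(records, policy, propensity):
    """Inverse-probability-weighted estimate of a policy's expected response.

    records    : list of (features, assigned_treatment, response) tuples
                 from a randomized experiment.
    policy     : function mapping features to a recommended treatment.
    propensity : dict mapping each treatment to its (assumed known)
                 assignment probability in the experiment.
    """
    if not records:
        raise ValueError("records must not be empty")
    total = 0.0
    for features, assigned, response in records:
        # Only subjects whose assigned treatment matches the recommendation
        # reveal the response the policy would have obtained.
        if policy(features) == assigned:
            total += response / propensity[assigned]
    return total / len(records)


if __name__ == "__main__":
    # Toy experiment: two treatments, each assigned with probability 0.5.
    records = [
        (1, "a", 1.0),
        (1, "b", 0.0),
        (-1, "a", 0.0),
        (-1, "b", 2.0),
    ]
    policy = lambda x: "a" if x > 0 else "b"
    print(estimate_expected_response(records, policy, {"a": 0.5, "b": 0.5}))
```

    Because the same function works for any number of treatment labels and any numeric response, it fits the multi-treatment, general-response setting the abstract describes.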