101 research outputs found

    Outlier detection based on random forest

    This paper proposes an outlier detection method based on random forests. Simulation experiments show that, compared with two other common distance-based outlier detection methods, the proposed method improves the accuracy and robustness of the model and significantly reduces computation time on large-scale datasets.

    Developing an Enterprise Credit Assessment System Based on Mixed Expert System

    Enterprise credit assessment is an important but rather complicated process for commercial banks, in which both quantitative and qualitative data on different aspects must be considered. This paper introduces a new method that combines an artificial neural network with an expert system. Using the prototype mixed expert system ECAMES (Enterprise Credit Assessment Mixed Expert System) as an example, it discusses the general scheme for designing and implementing such a system.

    Separating feature selection and its application to enterprise credit assessment

    Selecting the evaluation index system is the primary problem in enterprise credit assessment, and it is essentially a feature selection problem. This paper proposes a separating feature selection approach for combinations of SVMs (support vector machines). The basic idea is to perform feature selection separately for each SVM in the combination and to use the different selected feature subsets as the inputs of the respective sub-classifiers for combined modeling and prediction. Feature selection for each sub-classifier combines the filter and wrapper approaches. The SVMs are then combined using a binary tree structure suited to the characteristics of enterprise credit assessment. Experiments show that separating feature selection picks out small, diverse subsets of key indicators, improves the classification performance of the model, and is computationally simple and fast.

    A model based on random forests for enterprises credit assessment

    This paper introduces random forests (RF), a classifier combination algorithm that tolerates noise well and is highly stable, and uses it to build an enterprise credit assessment model. It analyzes methods suited to RF for handling imbalanced classification and describes the optimization of the model parameters. The model is built and tested on real data. Comparative experiments against neural network and SVM models confirm that the proposed method is effective and outperforms both.

    Satisfactory feature selection and its application to enterprise credit assessment

    The selection of the evaluation index system is one of the key problems in enterprise credit assessment. It is essentially a satisfactory feature selection (SFS) problem. In this paper, several novel satisfactory-rate functions of the feature set (SRFFS) are designed, which trade off the classification performance of a feature subset against its size. SVM cross-validation accuracy is used as the evaluation criterion for classification ability, and the SFS algorithm is described in detail. Comparative experiments are carried out on SFS and three other feature selection methods: S-SFS, Expert+GAFS and GAFS. The results show that SFS, which picks out a feature subset with low dimension, high classification accuracy and balanced ranking performance, is superior to the other three.

    Credit risk assessment in commercial banks based on fuzzy support vector machines

    Credit risk assessment plays an important role in banks' credit risk management. The objective of credit assessment is to decide credit ranks, which denote the capacity of enterprises to meet their financial commitments. The traditional "one-versus-one" approach is commonly used for multi-class classification based on the Support Vector Machine (SVM). Since the SVM for pattern recognition is based on binary classification, unclassifiable regions arise when it is extended to multi-class problems. To address this problem, this paper introduces a new credit risk assessment model based on fuzzy SVM that can give a reasonable classification for unclassifiable examples. Experimental results show that the fuzzy SVM method provides better generalization ability and assessment accuracy than the conventional one-versus-one multi-class approach.
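
    The unclassifiable-region problem described above can be illustrated with a minimal sketch. It assumes one common fuzzy formulation, in which a class's membership is the minimum of its pairwise decision values clipped at 1; the abstract does not state the paper's own membership function, and the decision values below are made up for illustration.

```python
from itertools import combinations

def ovo_votes(decision, n_classes):
    """Count one-versus-one votes; decision[(i, j)] > 0 means class i beats class j."""
    votes = [0] * n_classes
    for i, j in combinations(range(n_classes), 2):
        if decision[(i, j)] > 0:
            votes[i] += 1
        else:
            votes[j] += 1
    return votes

def fuzzy_memberships(decision, n_classes):
    """Membership of class i = min over j != i of min(1, D_ij), an assumed formulation."""
    memberships = []
    for i in range(n_classes):
        values = []
        for j in range(n_classes):
            if j == i:
                continue
            # Pairwise decision values are antisymmetric: D_ji = -D_ij.
            d = decision[(i, j)] if i < j else -decision[(j, i)]
            values.append(min(1.0, d))
        memberships.append(min(values))
    return memberships

# Hypothetical pairwise decision values for one sample and three credit ranks.
decision = {(0, 1): 0.5, (0, 2): -0.3, (1, 2): 0.8}
votes = ovo_votes(decision, 3)                  # every class gets one vote: a tie
memberships = fuzzy_memberships(decision, 3)
predicted = max(range(3), key=lambda i: memberships[i])
print(votes, memberships, predicted)
```

    Plain voting leaves this sample in an unclassifiable region because the three classes tie, while the membership values still single out one class.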

    Credit risk assessment of enterprise basing on neural network

    Enterprise credit rating is an important problem in the financial field. This paper studies it using an artificial neural network model. Enterprise samples are drawn according to their distribution across credit ratings and then divided by type into two sets, manufacturing and non-manufacturing. An indicator system for enterprise credit rating is built using partial correlation analysis. The paper also introduces several commonly used credit assessment models and compares the performance of the neural network model with them. Experimental results show that the neural network model has better predictive accuracy.

    Charge transfer in slow collisions of O8+ and Ar8+ ions with H(1s) below 2 keV/amu

    We calculated the charge-transfer cross sections for O8+ + H collisions for energies from 1 eV/amu to 2 keV/amu, using the recently developed hyperspherical close-coupling method. In particular, the discrepancy among previous theoretical calculations for electron capture to the n=6 states of O7+ is further analyzed. Our results indicate that at low energies (below 100 eV/amu) electron capture to the n=6 manifold of O7+ becomes dominant. The present results are used to resolve the long-standing discrepancies among the different elaborate semiclassical calculations near 100 eV/amu. We have also performed semiclassical atomic-orbital close-coupling calculations with straight-line trajectories. We found that the semiclassical calculations agree with the quantal approach at energies above 100 eV/amu, where the collision occurs at large impact parameters. Calculations for Ar8+ + H collisions in the same energy range have also been carried out to analyze the effect of the ionic core on the subshell cross sections. By using diabatic molecular basis functions, we show that converged results can be obtained with a small number of channels.

    Multi-classifier Combination for banks credit risk assessment

    Credit risk assessment is essentially a classification problem. In this paper, a multi-classifier combination algorithm is developed for banks' credit risk assessment. We adopt the back-propagation (BP) algorithm as the meta-learning algorithm and compare Bagging and Boosting as methods for constructing the multi-classifier system (MCS). Experimental results on real client data illustrate the effectiveness of the proposed method.
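
    As a rough illustration of the Bagging side of that comparison, the sketch below trains each base learner on a bootstrap resample and combines them by majority vote. It is not the paper's system: a one-dimensional threshold stump stands in for the BP network, and the data and function names are hypothetical.

```python
import random
from collections import Counter

def train_stump(samples):
    """Fit a 1-D threshold classifier that predicts 1 when x >= threshold."""
    best_threshold, best_errors = None, None
    for threshold, _ in samples:
        errors = sum((x >= threshold) != bool(y) for x, y in samples)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagging_fit(samples, n_models, seed=0):
    """Train one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models

def bagging_predict(models, x):
    """Combine the stumps by majority vote."""
    votes = Counter(int(x >= threshold) for threshold in models)
    return votes.most_common(1)[0][0]

# Hypothetical data: (risk score, label), where 1 marks a defaulting client.
data = [(0.1, 0), (0.2, 0), (0.35, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
models = bagging_fit(data, n_models=11)  # odd count avoids tied votes
low, high = bagging_predict(models, 0.05), bagging_predict(models, 0.95)
print(low, high)
```

    Boosting differs in that it reweights the training examples after each round instead of resampling them uniformly, so later learners concentrate on the examples earlier ones got wrong.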

    Using random forest for reliable classification and cost-sensitive learning for medical diagnosis

    Background: Most machine-learning classifiers output label predictions for new instances without indicating how reliable the predictions are. The applicability of these classifiers is limited in critical domains where incorrect predictions have serious consequences, such as medical diagnosis. Further, the default assumption of equal misclassification costs is most likely violated in medical diagnosis. Results: In this paper, we present a modified random forest classifier that is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme that uses Kolmogorov complexity to test the randomness of a particular sample with respect to the training sets. Our method is well calibrated: the performance can be set prior to classification, and the accuracy rate is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor, which takes into account different misclassification costs for different classes and allows a different confidence level to be specified for each class. Intensive experiments on benchmark datasets and real-world applications show that the resultant classifier is well calibrated and able to control the specific risk of each class. Conclusion: Using the RF outlier measure to design a nonconformity measure benefits the resultant predictor. Further, a label-conditional classifier is developed and turns out to be an alternative approach to the cost-sensitive learning problem that relies on label-wise predefined confidence levels. The goal of minimizing the risk of misclassification is achieved by specifying different confidence levels for different classes.
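
    The label-conditional idea can be sketched as follows. This is a simplified inductive variant using a held-out calibration set, not the transductive RF-based method of the paper. The nonconformity scores are given as plain numbers, where the paper would derive them from the RF outlier measure, and all names and values are hypothetical.

```python
def p_value(calibration_scores, new_score):
    """Share of scores (counting the new one) at least as nonconforming as new_score."""
    at_least = sum(1 for s in calibration_scores if s >= new_score)
    return (at_least + 1) / (len(calibration_scores) + 1)

def prediction_set(calibration_by_label, new_scores, significance_by_label):
    """Keep each label whose p-value exceeds that label's own significance level."""
    kept = []
    for label, scores in calibration_by_label.items():
        p = p_value(scores, new_scores[label])
        if p > significance_by_label[label]:
            kept.append(label)
    return kept

# Hypothetical nonconformity scores of calibration examples, grouped by true label.
calibration_by_label = {
    "healthy": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
    "disease": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
}
# Hypothetical scores of one new patient under each tentative label.
new_scores = {"healthy": 0.85, "disease": 0.45}
# A stricter level for "disease" reflects a higher cost of missing it.
significance = {"healthy": 0.25, "disease": 0.05}

labels = prediction_set(calibration_by_label, new_scores, significance)
print(labels)
```

    Because each label is tested against calibration examples of that label only, the error rate can be set separately per class, which is how differing misclassification costs enter.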