3 research outputs found

    Using random forest for reliable classification and cost-sensitive learning for medical diagnosis

    Background: Most machine-learning classifiers output label predictions for new instances without indicating how reliable those predictions are. This limits their applicability in critical domains such as medical diagnosis, where incorrect predictions have serious consequences. Further, the default assumption of equal misclassification costs is most likely violated in medical diagnosis. Results: In this paper, we present a modified random forest classifier that is incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme that uses Kolmogorov complexity to test the randomness of a particular sample with respect to the training set. Our method is well calibrated: the performance can be set prior to classification, and the accuracy rate is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor, which takes into account the different costs of misclassification in different classes and allows a different confidence level to be specified for each class. Intensive experiments on benchmark datasets and real-world applications show that the resulting classifier is well calibrated and able to control the specific risk of each class. Conclusion: Using the RF outlier measure to design a nonconformity measure benefits the resulting predictor. Further, the label-conditional classifier that we develop turns out to be an alternative approach to the cost-sensitive learning problem, relying on label-wise predefined confidence levels. The goal of minimizing the risk of misclassification is achieved by specifying a different confidence level for each class.
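The label-conditional idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes nonconformity scores are already available (the paper derives them from the random forest outlier measure, which is not reproduced here), and it uses a held-out set of scored examples rather than the full transductive procedure. The function names `label_conditional_p_values` and `prediction_set` are hypothetical.

```python
from collections import defaultdict


def label_conditional_p_values(ref_scores, ref_labels, test_scores):
    """Compute one p-value per candidate label, using only same-class scores.

    ref_scores[i] is the nonconformity score of reference example i under its
    true label ref_labels[i]. test_scores[y] is the nonconformity score of the
    new example when it is tentatively assigned label y. How the scores are
    produced is left open (assumption: any valid nonconformity measure).
    """
    by_label = defaultdict(list)
    for score, label in zip(ref_scores, ref_labels):
        by_label[label].append(score)

    p_values = {}
    for label, test_score in test_scores.items():
        same_class = by_label[label]
        # Count same-class examples that look at least as strange as the new one.
        at_least_as_strange = sum(1 for s in same_class if s >= test_score)
        # The +1 terms count the new example itself.
        p_values[label] = (at_least_as_strange + 1) / (len(same_class) + 1)
    return p_values


def prediction_set(p_values, significance):
    """Keep every label whose p-value exceeds that label's significance level.

    significance[y] = 1 - (confidence level chosen for class y), so a costly
    class can be given a stricter level than a cheap one.
    """
    return {label for label, p in p_values.items() if p > significance[label]}
```

Because each class is compared only against its own reference scores, the error rate can be controlled per class, which is what lets a different confidence level stand in for a different misclassification cost.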

    Providing a Service for Interactive Online Decision Aids through Estimating Consumers' Incremental Search Benefits

    Consumer information search has become a focus of research, especially in the context of online business environments. One research question is how much information to search for (i.e., when to stop searching), since an extensive literature in behavioral science has revealed that consumers often search either “too little” or “too much”, even with the help of existing interactive online decision aids (IODAs). To address this issue, this paper introduces a new approach to IODAs based on effective estimation of the incremental search benefits. The approach takes two aspects into consideration, namely point estimation and distribution estimation, so as to combine current and historical information in reflecting consumers' behavioral patterns during search. Moreover, experiments based on data provided by Netflix illustrate that the proposed approach is effective and advantageous over existing ones.

    Optimized Kernel-Based Conformal Predictor for Online Fault Detection

    In order to improve the computational efficiency of the conformal predictor, a procedure of adaptive kernel-based distance metric learning was incorporated into the algorithm. The learning process was divided into two stages. First, an optimized kernel was obtained by increasing the class separability of 75% of the training samples. Second, the k-nearest-neighbor method was used to design a nonconformity measure function in the optimized kernel space, and the standard conformal predictor algorithm was then run on the remaining 25% of the training samples. The new method was applied to multi-class fault diagnosis of the Tennessee Eastman process. The results show that the new algorithm substantially reduces computation time while maintaining high predictive efficiency. Funding: Xiamen University 985 Project Phase II Information Innovation Platform (0000-x07204); Xiamen Science and Technology Program (3502Z20083028
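The two-stage procedure described above can be sketched as follows, under several assumptions. The kernel optimization step is not reproduced: plain Euclidean distance stands in for distance in the optimized kernel space. The abstract does not give the exact nonconformity formula, so a commonly used k-nearest-neighbor ratio (distance to same-class neighbors divided by distance to other-class neighbors) is assumed. The names `split_samples`, `knn_score`, and `conformal_p_values` are hypothetical.

```python
import math


def split_samples(samples, kernel_fraction=0.75):
    """Split into the part used for kernel learning and the part used by the predictor."""
    cut = int(len(samples) * kernel_fraction)
    return samples[:cut], samples[cut:]


def knn_score(index, bag, k, distance):
    """Assumed k-NN nonconformity: same-class distance sum over other-class distance sum."""
    x, label = bag[index]
    same, other = [], []
    for j, (xj, yj) in enumerate(bag):
        if j == index:
            continue
        (same if yj == label else other).append(distance(x, xj))
    same.sort()
    other.sort()
    # Guard against division by zero when an other-class point coincides with x.
    return sum(same[:k]) / max(sum(other[:k]), 1e-12)


def conformal_p_values(x, labels, bag, k=1, distance=math.dist):
    """Standard (transductive) conformal p-values for each candidate label.

    `bag` is a list of (features, label) pairs, here the 25% part of the data.
    `distance` is a placeholder for the learned kernel-space distance.
    """
    p_values = {}
    for label in labels:
        extended = bag + [(x, label)]  # tentatively assign the candidate label
        scores = [knn_score(i, extended, k, distance) for i in range(len(extended))]
        test_score = scores[-1]
        # Fraction of examples at least as nonconforming as the new one.
        p_values[label] = sum(1 for s in scores if s >= test_score) / len(scores)
    return p_values
```

Running the predictor on only a quarter of the samples keeps the per-prediction cost low, since every candidate label requires rescoring the whole bag; this is one plausible reading of where the reported time savings come from.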