7 research outputs found

    Aggregative quantification for regression

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10618-013-0308-z
    The problem of estimating the class distribution (or prevalence) for a new unlabelled dataset (from a possibly different distribution) is a very common problem which has been addressed in one way or another in the past decades. This problem has been recently reconsidered as a new task in data mining, renamed quantification when the estimation is performed as an aggregation (and possible adjustment) of a single-instance supervised model (e.g., a classifier). However, the study of quantification has been limited to classification, while it is clear that this problem also appears, perhaps even more frequently, with other predictive problems, such as regression. In this case, the goal is to determine a distribution or an aggregated indicator of the output variable for a new unlabelled dataset. In this paper, we introduce a comprehensive new taxonomy of quantification tasks, distinguishing between the estimation of the whole distribution and the estimation of some indicators (summary statistics), for both classification and regression. This distinction is especially useful for regression, since predictions are numerical values that can be aggregated in many different ways, as in multi-dimensional hierarchical data warehouses. We focus on aggregative quantification for regression and see that the approaches borrowed from classification do not work. We present several techniques based on segmentation which are able to produce accurate estimations of the expected value and the distribution of the output variable. We show experimentally that these methods especially excel for the relevant scenarios where training and test distributions dramatically differ.
    We would like to thank the anonymous reviewers for their careful reviews, insightful comments and very useful suggestions. This work was supported by the MEC/MINECO projects CONSOLIDER-INGENIO CSD2007-00022 and TIN 2010-21062-C02-02, GVA project PROMETEO/2008/051, the COST-European Cooperation in the field of Scientific and Technical Research IC0801 AT, and the REFRAME project granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA), and funded by the Ministerio de Economia y Competitividad in Spain.
    Bella Sanjuán, A.; Ferri Ramírez, C.; Hernández Orallo, J.; Ramírez Quintana, MJ. (2014). Aggregative quantification for regression. Data Mining and Knowledge Discovery. 28(2):475-518. https://doi.org/10.1007/s10618-013-0308-z

    A review on quantification learning

    The task of quantification consists in providing an aggregate estimation (e.g. the class distribution in a classification problem) for unseen test sets, applying a model that is trained using a training set with a different data distribution. Several real-world applications demand this kind of method, which does not require predictions for individual examples and focuses only on obtaining accurate estimates at an aggregate level. During the past few years, several quantification methods have been proposed from different perspectives and with different goals. This paper presents a unified review of the main approaches with the aim of serving as an introductory tutorial for newcomers in the field.
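
    As an illustration of what an aggregate estimate looks like in practice, the sketch below contrasts two baselines that are commonly discussed in the quantification literature: counting the positive predictions of a classifier, and correcting that count with the classifier's true and false positive rates. The abstract does not name these methods, so treat them as an assumed example; the function names, the prediction vector and the rates are all hypothetical.

```python
from typing import Sequence


def classify_and_count(predictions: Sequence[int]) -> float:
    """Fraction of test items predicted positive (labels are 0 or 1)."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    return sum(predictions) / len(predictions)


def adjusted_count(predictions: Sequence[int], tpr: float, fpr: float) -> float:
    """Correct the raw count using true/false positive rates.

    tpr and fpr are assumed to have been estimated on held-out labelled data.
    """
    if tpr <= fpr:
        raise ValueError("tpr must exceed fpr for the correction to be defined")
    raw = classify_and_count(predictions)
    corrected = (raw - fpr) / (tpr - fpr)
    # Clip, because the correction can fall outside the [0, 1] interval.
    return min(1.0, max(0.0, corrected))


# Hypothetical test-set predictions: 4 of 10 items predicted positive.
preds = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
raw_estimate = classify_and_count(preds)
adjusted_estimate = adjusted_count(preds, tpr=0.8, fpr=0.2)
print(f"raw: {raw_estimate:.3f}, adjusted: {adjusted_estimate:.3f}")
```

    Neither function needs the individual predictions to be right; only the aggregate has to be accurate, which is the point the abstract makes.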

    Multidimensional Prediction Models When the Resolution Context Changes

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-23525-7_31
    Multidimensional data is systematically analysed at multiple granularities by applying aggregate and disaggregate operators (e.g., by the use of OLAP tools). For instance, in a supermarket we may want to predict sales of tomatoes for next week, but we may also be interested in predicting sales for all vegetables (higher up in the product hierarchy) for next Friday (lower down in the time dimension). While the domain and data are the same, the operating context is different. We explore several approaches for multidimensional data when predictions have to be made at different levels (or contexts) of aggregation. One method relies on the same resolution, another approach aggregates predictions bottom-up, a third approach disaggregates predictions top-down and a final technique corrects predictions using the relation between levels. We show how these strategies behave when the resolution context changes, using several machine learning techniques in four application domains.
    This work was supported by the Spanish MINECO under grants TIN 2010-21062-C02-02 and TIN 2013-45732-C4-1-P, and the REFRAME project, granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA), and funded by MINECO in Spain (PCIN-2013-037) and by Generalitat Valenciana PROMETEOII2015/013.
    Martínez Usó, A.; Hernández Orallo, J. (2015). Multidimensional Prediction Models When the Resolution Context Changes. In Machine Learning and Knowledge Discovery in Databases. Springer. 509-524. https://doi.org/10.1007/978-3-319-23525-7_31
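
    The bottom-up and top-down strategies mentioned in the abstract can be sketched roughly as follows. This is only an assumed illustration on a toy product hierarchy: the hierarchy, the numbers, and in particular the choice to split a category-level prediction by historical shares are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Dict

# Hypothetical product hierarchy: item -> category.
PARENT = {"tomato": "vegetables", "lettuce": "vegetables", "apple": "fruit"}


def aggregate_bottom_up(item_values: Dict[str, float]) -> Dict[str, float]:
    """Sum item-level values into category-level totals."""
    totals: Dict[str, float] = defaultdict(float)
    for item, value in item_values.items():
        totals[PARENT[item]] += value
    return dict(totals)


def disaggregate_top_down(
    category_predictions: Dict[str, float],
    historical_item_totals: Dict[str, float],
) -> Dict[str, float]:
    """Split category-level predictions by each item's historical share.

    Using historical shares is an assumption made for this sketch.
    """
    category_history = aggregate_bottom_up(historical_item_totals)
    return {
        item: category_predictions[PARENT[item]] * past / category_history[PARENT[item]]
        for item, past in historical_item_totals.items()
    }


# Bottom-up: predict per item, then aggregate to the category level.
bottom_up = aggregate_bottom_up({"tomato": 30.0, "lettuce": 10.0, "apple": 20.0})

# Top-down: predict per category, then distribute down to the items.
top_down = disaggregate_top_down(
    {"vegetables": 50.0, "fruit": 20.0},
    {"tomato": 60.0, "lettuce": 20.0, "apple": 40.0},
)
print(bottom_up)
print(top_down)
```

    The two routes generally give different answers for the same cell of the hierarchy, which is why the choice of strategy matters when the resolution context changes.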

    Probabilistic reframing for cost-sensitive regression

    © ACM, 2014. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Knowledge Discovery from Data (TKDD), VOL. 8, ISS. 4, (October 2014) http://doi.acm.org/10.1145/2641758
    Common-day applications of predictive models usually involve the full use of the available contextual information. When the operating context changes, one may fine-tune the by-default (incontextual) prediction or may even abstain from predicting a value (a reject). Global reframing solutions, where the same function is applied to adapt the estimated outputs to a new cost context, are possible solutions here. An alternative approach, which has not been studied in a comprehensive way for regression in the knowledge discovery and data mining literature, is the use of a local (e.g., probabilistic) reframing approach, where decisions are made according to the estimated output and a reliability, confidence, or probability estimation. In this article, we advocate for a simple two-parameter (mean and variance) approach, working with a normal conditional probability density. Given the conditional mean produced by any regression technique, we develop lightweight “enrichment” methods that produce good estimates of the conditional variance, which are used by the probabilistic (local) reframing methods. We apply these methods to some very common families of cost-sensitive problems, such as optimal predictions in (auction) bids, asymmetric loss scenarios, and rejection rules.
    This work was supported by the MEC/MINECO projects CONSOLIDER-INGENIO CSD2007-00022 and TIN 2010-21062-C02-02, and TIN 2013-45732-C4-1-P and GVA projects PROMETEO/2008/051 and PROMETEO2011/052. Finally, part of this work was motivated by the REFRAME project (http://www.reframe-d2k.org) granted by the European Coordinated Research on Long-term Challenges in Information and Communication Sciences & Technologies ERA-Net (CHIST-ERA) and funded by Ministerio de Economia y Competitividad in Spain (PCIN-2013-037).
    Hernández Orallo, J. (2014). Probabilistic reframing for cost-sensitive regression. ACM Transactions on Knowledge Discovery from Data. 8(4):1-55. https://doi.org/10.1145/2641758
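
    To make the two-parameter idea concrete, the sketch below shifts a prediction under an asymmetric loss, given a conditional mean and standard deviation. It assumes a piecewise-linear loss in which under-prediction and over-prediction have different per-unit costs; for that loss the expected cost is minimised at a quantile of the predictive distribution. The abstract does not state which loss families the article uses, so the loss form, the function name and the example costs are assumptions.

```python
from statistics import NormalDist


def reframed_prediction(
    mean: float, std: float, under_cost: float, over_cost: float
) -> float:
    """Prediction minimising expected piecewise-linear asymmetric loss.

    Assumes y ~ Normal(mean, std) and a loss of under_cost per unit when
    the prediction is too low and over_cost per unit when it is too high.
    """
    if std <= 0 or under_cost <= 0 or over_cost <= 0:
        raise ValueError("std and both costs must be positive")
    # The minimiser is the quantile at under_cost / (under_cost + over_cost).
    quantile = under_cost / (under_cost + over_cost)
    return NormalDist(mu=mean, sigma=std).inv_cdf(quantile)


# Equal costs leave the conditional mean unchanged.
symmetric = reframed_prediction(100.0, 10.0, under_cost=1.0, over_cost=1.0)

# Under-predicting is three times as costly, so the prediction moves up.
skewed = reframed_prediction(100.0, 10.0, under_cost=3.0, over_cost=1.0)
print(f"symmetric: {symmetric:.2f}, skewed: {skewed:.2f}")
```

    The size of the shift scales with the standard deviation, which is why a good estimate of the conditional variance matters for this kind of local adjustment.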