46 research outputs found

    Views in Australia

    Lithographs have the following titles: Agnes River, Corner Inlet, Gipps Land -- West side of Mt Arapiles -- Mitchell River -- Mt. Munda from St. Hubert, Yering -- McAlister Valley, Gipps Land -- Wentworth Rive

    Cost-sensitive classification problem (Poster)

    No full text
    In practical situations almost all classification problems are cost-sensitive or utility-based in one way or another. This exercise mimics a real situation in which students first have to translate a description into a data mining workflow, learn a prediction model, apply it to new data, and set up a testing strategy to estimate its performance. The exercise is suitable for students following an introductory data mining course; it has been used in my introductory data mining class (3 ECTS; 3rd BSc Computer Science students) for two years now. Students work on it in class for approximately 1 hour and finish the exercise at home. Solutions are to be sent to the lecturer, and discussing the solutions in the next lecture takes approximately 30 minutes.
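As a rough illustration of what "cost-sensitive" can mean in such an exercise, the sketch below picks the label with the lower expected misclassification cost and totals the cost of a set of predictions. The function names, the binary 0/1 labels, and the cost values are assumptions made for this example; they are not taken from the exercise itself.

```python
def min_cost_label(p_positive, cost_fp, cost_fn):
    """Pick the label (1 or 0) with the lower expected misclassification cost.

    p_positive: estimated probability that the true label is 1.
    cost_fp:    assumed cost of predicting 1 when the truth is 0.
    cost_fn:    assumed cost of predicting 0 when the truth is 1.
    """
    expected_cost_if_positive = (1.0 - p_positive) * cost_fp
    expected_cost_if_negative = p_positive * cost_fn
    return 1 if expected_cost_if_positive < expected_cost_if_negative else 0


def total_cost(y_true, y_pred, cost_fp, cost_fn):
    """Sum the misclassification costs over paired true and predicted labels."""
    cost = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 0 and pred == 1:
            cost += cost_fp  # false positive
        elif truth == 1 and pred == 0:
            cost += cost_fn  # false negative
    return cost


if __name__ == "__main__":
    # Hypothetical costs: a missed positive is five times as costly as a false alarm.
    print(min_cost_label(0.3, cost_fp=1, cost_fn=5))
    print(total_cost([1, 0, 1, 0], [0, 1, 1, 0], cost_fp=1, cost_fn=5))
```

With unequal costs, the decision threshold on the predicted probability moves away from 0.5, which is why accuracy alone is usually a poor testing criterion for this kind of problem.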

    What is data mining and how does it work?

    No full text
    Due to recent technological developments it has become possible to generate and store increasingly large datasets. It is not the amount of data, however, that determines its value, but the ability to interpret and analyze the data and to base future policies and decisions on the outcome of the analysis. The amounts of data collected nowadays not only offer unprecedented opportunities to improve decision procedures for companies and governments, but also pose great challenges. Many pre-existing data analysis tools did not scale up to the current data sizes. From this need, the research field of data mining emerged. In this chapter we position data mining with respect to other data analysis techniques and introduce the most important classes of techniques developed in the area: pattern mining, classification, and clustering and outlier detection. Related supporting techniques such as pre-processing and database coupling are also discussed.

    Discrimination aware classification (Extended abstract)

    No full text
    No abstract

    Why unbiased computational processes can lead to discriminative decision procedures (Chapter 3)

    No full text
    Nowadays, more and more decision procedures are supported or even guided by automated processes. An important technique in this automation is data mining. In this chapter we study how such automatically generated decision support models may exhibit discriminatory behavior towards certain groups based upon, e.g., gender or ethnicity. Surprisingly, such behavior may even be observed when sensitive information is removed or suppressed and the whole procedure is guided by neutral arguments such as predictive accuracy only. The reason for this phenomenon is that most data mining methods are based upon assumptions that are not always satisfied in reality, namely, that the data is correct and represents the population well. In this chapter we discuss the implicit modeling assumptions made by most data mining algorithms and show situations in which they are not satisfied. Then we outline three realistic scenarios in which an unbiased process can lead to discriminatory models. The effects of the implicit assumptions not being fulfilled are illustrated by examples. The chapter concludes with an outline of the main challenges and problems to be solved

    Classification with no discrimination by preferential sampling

    No full text
    The concept of classification without discrimination is a new area of research. Kamiran & Calders (2009) introduced the idea of Classification with No Discrimination (CND) and proposed a solution based on "massaging" the data to remove the discrimination from it with the least possible changes. In this paper, we propose a new solution to the CND problem by introducing a sampling scheme for making the data discrimination-free instead of relabeling the dataset. On the resulting non-discriminatory dataset we then learn a classifier. This new method is not only less intrusive than "massaging" but also outperforms the "reweighing" approach of Calders et al. (2009). The proposed method has been implemented, and experimental results on the Census Income dataset are promising: in all experiments our method performs on par with the state-of-the-art non-discriminatory techniques.
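The abstract does not say how discrimination is measured. One common choice, assumed here for illustration only, is the difference in positive-outcome rates between the favored and the deprived group. The sketch below computes that score on a labeled dataset; the function name, group codes, and toy data are hypothetical, and this is not the paper's sampling scheme.

```python
def discrimination(groups, labels, deprived="f", positive=1):
    """Positive rate of the favored group minus that of the deprived group.

    groups:   sensitive-attribute value for each record.
    labels:   class label for each record.
    deprived: value of the sensitive attribute marking the deprived group;
              every other value is treated as favored (an assumption).
    """
    deprived_labels = [l for g, l in zip(groups, labels) if g == deprived]
    favored_labels = [l for g, l in zip(groups, labels) if g != deprived]
    if not deprived_labels or not favored_labels:
        raise ValueError("both groups must contain at least one record")

    def positive_rate(values):
        return sum(1 for v in values if v == positive) / len(values)

    return positive_rate(favored_labels) - positive_rate(deprived_labels)


if __name__ == "__main__":
    # Toy data: 3 of 4 favored records are positive, 1 of 4 deprived records is.
    groups = ["m", "m", "m", "m", "f", "f", "f", "f"]
    labels = [1, 1, 1, 0, 1, 0, 0, 0]
    print(discrimination(groups, labels))
```

A score of 0 means both groups receive the positive label at the same rate; under this definition, a resampled or relabeled training set would be judged by how close it brings the score to 0.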

    Introduction to the special section on educational data mining

    No full text
    Educational Data Mining (EDM) is an emerging multidisciplinary research area in which methods and techniques are developed for exploring data originating from various educational information systems. EDM is both a learning science and a rich application area for data mining, owing to the growing availability of educational data. EDM contributes to the study of how students learn and of the settings in which they learn. It enables data-driven decision making for improving current educational practice and learning material. We present a brief overview of EDM and introduce four selected EDM papers representing a cross-section of different application areas for data mining in education.
