5 research outputs found

    Classifying Multiple imbalanced attributes in relational data

    Real-world data are often stored as relational database systems with different numbers of significant attributes. Unfortunately, most classification techniques are proposed for learning from balanced nonrelational data and mainly for classifying one single attribute. In this paper, we propose an approach for learning from relational data with the specific goal of classifying multiple imbalanced attributes. In our approach, we extend a relational modelling technique (PRMs-IM) designed for imbalanced relational learning to deal with multiple imbalanced attributes classification. We address the problem of classifying multiple imbalanced attributes by enriching the PRMs-IM with the 'Bagging' classification ensemble. We evaluate our approach on real-world imbalanced student relational data and demonstrate its effectiveness in predicting student performance.

    Cost-Sensitive Learning with Conditional Markov Networks

    Social Network Analysis has long been an important field of research in the social sciences. Recent developments such as the proliferation of online communities and communication networks have shown the need for scalable techniques for extracting, analyzing and mining large real-world social networks. These networks consist of entities linked by various relations. Predictive models which exploit both the attributes of entities and relations and their relational patterns are important for identifying key actors and important (or anomalous) links.

    Cost-sensitive learning with conditional markov networks

    There has been a recent, growing interest in classification and link prediction in structured domains. Methods such as conditional random fields and relational Markov networks support flexible mechanisms for modeling correlations due to the link structure. In addition, in many structured domains, there is an interesting structure in the risk or cost function associated with different misclassifications. There is a rich tradition of cost-sensitive learning applied to unstructured (IID) data. Here we propose a general framework which can capture correlations in the link structure and handle structured cost functions. We present two new cost-sensitive structured classifiers based on maximum entropy principles. The first determines the cost-sensitive classification by minimizing the expected cost of misclassification. The second directly determines the cost-sensitive classification without going through a probability estimation step. We contrast these approaches with an approach which employs a standard 0/1-loss structured classifier to estimate class conditional probabilities followed by minimization of the expected cost of misclassification and with a cost-sensitive IID classifier that does not utilize the correlations present in the link structure. We demonstrate the utility of our cost-sensitive structured classifiers with experiments on both synthetic and real-world data.
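
The expected-cost decision rule mentioned in this abstract can be illustrated for the simple unstructured (IID) case. The sketch below is not the authors' structured classifier: it assumes class probabilities are already available from some model, and the function name `min_expected_cost_label`, the probabilities, and the cost matrix are all hypothetical values chosen for illustration.

```python
def min_expected_cost_label(probs, cost):
    """Pick the label with the lowest expected misclassification cost.

    probs[k]   : estimated probability that the true class is k (assumed given)
    cost[k][j] : cost of predicting class j when the true class is k
    """
    n = len(probs)
    # Expected cost of predicting j, averaged over the possible true classes.
    expected = [sum(probs[k] * cost[k][j] for k in range(n)) for j in range(n)]
    best = min(range(n), key=expected.__getitem__)
    return best, expected


# Hypothetical two-class example: class 0 is more probable, but missing
# class 1 (predicting 0 when the truth is 1) is five times as costly.
probs = [0.7, 0.3]
cost = [
    [0.0, 1.0],  # true class 0: predicting 1 costs 1
    [5.0, 0.0],  # true class 1: predicting 0 costs 5
]
label, expected = min_expected_cost_label(probs, cost)
print(label, expected)
```

With these illustrative numbers the rule selects class 1 even though class 0 has the higher probability, which is the behaviour a plain 0/1-loss classifier would not show.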