A Decision Tree Approach to Predicting Recidivism in Domestic Violence
Domestic violence (DV) is a global social and public health issue that is
highly gendered. Being able to accurately predict DV recidivism, i.e.,
re-offending of a previously convicted offender, can speed up and improve risk
assessment procedures for police and front-line agencies, better protect
victims of DV, and potentially prevent future re-occurrences of DV. Previous
work in DV recidivism has employed different classification techniques,
including decision tree (DT) induction and logistic regression, where the main
focus was on achieving high prediction accuracy. As a result, even the diagrams
of trained DTs were often too difficult to interpret due to their size and
complexity, making decision-making challenging. Given there is often a
trade-off between model accuracy and interpretability, in this work our aim is
to employ DT induction to obtain both interpretable trees and high
prediction accuracy. Specifically, we implement and evaluate different
approaches to deal with class imbalance as well as feature selection. Compared
to previous work in DV recidivism prediction that employed logistic regression,
our approach can achieve comparable area under the ROC curve results by using
only 3 of 11 available features and generating understandable decision trees
that contain only 4 leaf nodes.
Comment: 12 pages; Accepted at The 2018 Pacific-Asia Conference on Knowledge
Discovery and Data Mining (PAKDD)
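The abstract names three ingredients without giving details: a small decision tree (3 features, 4 leaf nodes), some treatment of class imbalance, and evaluation by area under the ROC curve. The following is a minimal illustrative sketch of those ideas in plain Python, not the authors' implementation. The feature names `x1`, `x2`, `x3`, the split thresholds, the leaf scores, and the choice of random oversampling are all assumptions made for illustration; the abstract does not say which imbalance technique or which features were used.

```python
import random
from collections import Counter


def tree_score(x1, x2, x3):
    """Hypothetical tree: 3 splits on 3 features, 4 leaves.

    Thresholds and leaf scores are made up for illustration only.
    """
    if x1 <= 0.5:
        return 0.10 if x2 <= 0.5 else 0.35
    return 0.55 if x3 <= 0.5 else 0.90


def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes match
    the largest class in size (one common imbalance remedy, assumed here)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because a tree with 4 leaves can emit at most 4 distinct scores, its ROC curve has at most 4 steps, which is part of why such a tree is easy to read; whether it matches a larger model's AUC is an empirical question that this sketch does not address.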