Constrained linear regression models for symbolic interval-valued variables
This paper introduces an approach to fitting a constrained linear regression model to interval-valued data. Each example of the learning set is described by a feature vector for which each feature value is an interval. The new approach fits a constrained linear regression model on the midpoints and ranges of the interval values assumed by the variables in the learning set. The lower and upper boundaries of the interval value of the dependent variable are predicted from its midpoint and range, which are estimated from the fitted linear regression models applied to the midpoint and range of each interval value of the independent variables. The method shows the importance of range information for prediction performance, as well as the use of inequality constraints to ensure mathematical coherence between the predicted values of the lower and upper boundaries of the interval. The authors also propose an expression for a goodness-of-fit measure called the determination coefficient. The proposed prediction method is assessed by estimating the average behavior of the root-mean-square error and the square of the correlation coefficient in a Monte Carlo experiment with different data set configurations. Among other aspects, the synthetic data sets take into account the dependence, or lack thereof, between the midpoint and range of the intervals. The bias produced by the use of inequality constraints on the vector of parameters is also examined in terms of the mean-square error of the parameter estimates. Finally, the approaches proposed in this paper are applied to a real data set and their performances are compared.
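The midpoint-and-range idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the inequality constraints take the form of non-negative coefficients in the range model, it defines the range as upper minus lower boundary, and the function names (`nnls_small`, `fit_interval_model`, `predict_interval`) are hypothetical. The non-negative fit uses a brute-force search over coefficient supports, which is only practical for a handful of features.

```python
from itertools import combinations

import numpy as np


def nnls_small(A, b):
    """Non-negative least squares by enumerating supports (few columns only)."""
    n_cols = A.shape[1]
    # Baseline candidate: all coefficients zero.
    best_x, best_sse = np.zeros(n_cols), float(b @ b)
    for k in range(1, n_cols + 1):
        for support in combinations(range(n_cols), k):
            cols = list(support)
            coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
            if np.all(coef >= 0):  # keep only feasible candidates
                x = np.zeros(n_cols)
                x[cols] = coef
                sse = float(np.sum((A @ x - b) ** 2))
                if sse < best_sse:
                    best_x, best_sse = x, sse
    return best_x


def _design(X):
    """Prepend an intercept column."""
    return np.hstack([np.ones((X.shape[0], 1)), X])


def fit_interval_model(X_lo, X_hi, y_lo, y_hi):
    """Fit one model on midpoints and one (constrained) model on ranges."""
    X_mid, X_rng = (X_lo + X_hi) / 2, X_hi - X_lo
    y_mid, y_rng = (y_lo + y_hi) / 2, y_hi - y_lo
    # Midpoint model: ordinary least squares, no constraints.
    beta_mid, *_ = np.linalg.lstsq(_design(X_mid), y_mid, rcond=None)
    # Range model: coefficients constrained to be non-negative (assumption).
    beta_rng = nnls_small(_design(X_rng), y_rng)
    return beta_mid, beta_rng


def predict_interval(beta_mid, beta_rng, X_lo, X_hi):
    """Rebuild lower and upper boundaries from predicted midpoint and range."""
    mid = _design((X_lo + X_hi) / 2) @ beta_mid
    rng = _design(X_hi - X_lo) @ beta_rng
    return mid - rng / 2, mid + rng / 2
```

Because input ranges are non-negative and the range coefficients are constrained to be non-negative, the predicted range cannot be negative, so the predicted lower boundary never exceeds the predicted upper boundary. An unconstrained range model offers no such guarantee.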
Identifying Early Help Referrals For Local Authorities With Machine Learning And Bias Analysis
Local authorities in England, such as Leicestershire County Council (LCC),
provide Early Help services that can be offered at any point in a young
person's life when they experience difficulties that cannot be supported by
universal services alone, such as schools. This paper investigates the
utilisation of machine learning (ML) to assist experts in identifying families
that may need to be referred for Early Help assessment and support. LCC
provided an anonymised dataset comprising 14360 records of young people under
the age of 18. The dataset was pre-processed, machine learning models were
built, and experiments were conducted to validate and test the performance of
the models. Bias mitigation techniques were applied to improve the fairness of
these models. During testing, while the models demonstrated the capability to
identify young people requiring intervention or early help, they also produced
a significant number of false positives, especially when constructed with
imbalanced data, incorrectly identifying individuals who most likely did not
need an Early Help referral. This paper empirically explores the suitability of
data-driven ML models for identifying young people who may require Early Help
services and discusses their appropriateness and limitations for this task.