Finite Mixture Models based on Scale Mixtures of Skew-Normal distributions applied to serological data

Abstract

Serological data can be described as a mixture of distributions, with each mixture component representing a serological population (e.g. the seronegative and seropositive populations). In seroepidemiological studies of infectious diseases, mixture models with Normal components are the most common choice, which implies that the components of the mixture are approximately symmetric. However, skewness to the left has been observed, especially in seropositive populations, which violates the normality assumption. To capture possible skewness in serological data, this work uses the family of Scale Mixtures of Skew-Normal (SMSN) distributions, of which the Skew-Normal and Skew-t distributions are particular cases. The Skew-t distribution, being heavy-tailed, can additionally accommodate possible outliers. Beyond the models used to describe the behavior of serological data, the work explores the estimation of the cutoff point for classifying an individual as seropositive. Two perspectives on the problem are presented: one in which the true state of the disease is unknown, and another in which this state is known a priori. The widespread use of a cutoff point without statistical methodology to support its estimation may affect the estimated seroprevalence of a population, that is, the proportion of seropositive individuals. Three methods based on mixture models are therefore proposed in this work for estimating the cutoff point when the true infection status is unknown.
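To make the idea of a mixture-based cutoff concrete, the following is a minimal sketch, not the method of this work. It assumes a two-component Skew-Normal mixture with known parameters and places the cutoff where the posterior probability of belonging to the seropositive component equals 0.5. That rule is an illustrative assumption and is not necessarily one of the three methods proposed here. All function names and parameter values are hypothetical and are not fitted to any data.

```python
import math


def normal_pdf(z):
    """Standard Normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(x, loc, scale, shape):
    """Skew-Normal density; shape = 0 recovers the Normal density."""
    z = (x - loc) / scale
    return 2.0 / scale * normal_pdf(z) * normal_cdf(shape * z)


def posterior_seropositive(x, weight_pos, neg, pos):
    """Posterior probability that x comes from the seropositive component.

    neg and pos are (loc, scale, shape) tuples; weight_pos is the mixing
    weight of the seropositive component (hypothetical parametrization).
    """
    f_neg = (1.0 - weight_pos) * skew_normal_pdf(x, *neg)
    f_pos = weight_pos * skew_normal_pdf(x, *pos)
    return f_pos / (f_neg + f_pos)


def cutoff_by_bisection(weight_pos, neg, pos, lo, hi, target=0.5, tol=1e-10):
    """Find x in [lo, hi] where the seropositive posterior equals target."""
    g_lo = posterior_seropositive(lo, weight_pos, neg, pos) - target
    g_hi = posterior_seropositive(hi, weight_pos, neg, pos) - target
    if g_lo * g_hi > 0:
        raise ValueError("target posterior is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = posterior_seropositive(mid, weight_pos, neg, pos) - target
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


# Made-up parameters: a symmetric seronegative component and a
# left-skewed (negative shape) seropositive component.
seronegative = (1.0, 0.5, 0.0)
seropositive = (4.0, 1.0, -3.0)
cutoff = cutoff_by_bisection(0.4, seronegative, seropositive, lo=1.0, hi=4.0)
```

In this sketch, an individual with a measured value above `cutoff` would be classified as seropositive. Replacing the Skew-Normal density with a Skew-t density would follow the same pattern but requires the Student-t distribution function, which the Python standard library does not provide.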
