123 research outputs found

    Hybrid methodology for Markovian epidemic models

    In this thesis, we introduce a hybrid discrete-continuous approach suitable for analysing a wide range of epidemiological models, and an approach for improving parameter estimation from data describing the early stages of an outbreak. We restrict our attention to epidemiological models with continuous-time Markov chain (CTMC) dynamics, a ubiquitous framework also commonly used for modelling telecommunication networks, chemical reactions and evolutionary genetics. We introduce our methodology in the framework of the well-known Susceptible–Infectious–Removed (SIR) model, one of the simplest approaches for describing the spread of an infectious disease. We later extend it to a variant of the Susceptible–Exposed–Infectious–Removed (SEIR) model, a generalisation of the SIR CTMC that is more realistic for modelling the initial stage of many outbreaks. Compartmental CTMC models are attractive due to their stochastic individual-to-individual representation of disease transmission. This feature is particularly important when only a small number of infectious individuals are present, during which stage the probability of epidemic fade out is considerable. Unfortunately, the simple SIR CTMC has a state space of order N², where N is the size of the population being modelled, and hence computational limits are quickly reached as N increases. There are a number of approaches to dealing with this issue, most of which are founded on the principle of restricting one’s attention to the dynamics of the CTMC on a subset of its state space. However, two highly efficient approaches published in 1970 and 1971 provide a promising alternative. The fluid limit [Kurtz, 1970] and diffusion limit [Kurtz, 1971] are large-population approximations of a particular class of CTMC models which approximate the evolution of the underlying CTMC by a deterministic trajectory and a Gaussian diffusion process, respectively.
These large-population approximations are governed by a compact system of ordinary differential equations and are suitably accurate so long as the underlying population is sufficiently large. Unfortunately, they become inaccurate if the population of at least one compartment of the underlying CTMC is close to an absorbing boundary, such as during the initial stages of an outbreak. It follows that a natural approach to approximating a CTMC model of a large population is to adopt a hybrid framework, whereby CTMC dynamics are utilised during the initial stages of the outbreak and a suitable large-population approximation is utilised otherwise. In the framework of the SIR CTMC, we present a hybrid fluid model and a hybrid diffusion model which utilise CTMC dynamics while the number of infectious individuals is low and otherwise utilise the fluid limit and the diffusion limit, respectively. We illustrate the utility of our hybrid methodology in computing two key quantities, the distribution of the duration of the outbreak and the distribution of the final size of the outbreak. We demonstrate that the hybrid fluid model provides a suitable approximation of the distribution of the duration of the outbreak and the hybrid diffusion model provides a suitable approximation of the distribution of the final size of the outbreak. In addition, we demonstrate that our hybrid methodology provides a substantial advantage in computational efficiency over the original SIR CTMC and is superior in accuracy to similar hybrid large-population approaches when considering mid-sized populations. During the initial stages of an outbreak, calibrating a model describing the spread of the disease to the observed data is fundamental to understanding and potentially controlling the disease.
A key factor considered by public health officials in planning their response to an outbreak is the transmission potential of the disease, a factor which is informed by estimates of the basic reproductive number, R₀, defined as the average number of secondary cases resulting from a single infectious case in a naive population. However, it is often the case that estimates of R₀ based on data from the initial stages of an outbreak are positively biased. This bias may be the result of various features such as the geography and demography of the outbreak. However, a consideration which is often overlooked is that the outbreak was not detected until such a time as it had established a considerable chain of transmissions, therefore effectively overcoming initial fade out. This is an important feature because the probability of initial fade out is often considerable, making the event that the outbreak becomes established somewhat unlikely. A straightforward way of accounting for this is to condition the model on a particular event, which models the disease overcoming initial fade out. In the framework of both the SIR CTMC and the SEIR CTMC we present a conditioned approach to estimating R₀ from data on the initial stages of an outbreak. For the SIR CTMC, we demonstrate that in certain circumstances, conditioning the model on effectively overcoming initial fade out reduces bias in estimates of R₀ by 0.3 on average, compared to the original CTMC model. Noting that the conditioned model utilises CTMC dynamics throughout, we demonstrate the flexibility of our hybrid methodology by presenting a conditioned hybrid diffusion approach for estimating R₀. 
We demonstrate that our conditioned hybrid diffusion approach still provides estimates of R₀ which exhibit less bias than under an unconditioned hybrid diffusion model, and that the diffusion methodology enables us to consider larger outbreaks than would have been computationally feasible in the original conditioned CTMC framework. We demonstrate the flexibility of our conditioned hybrid approach by applying it to a variant of the SEIR CTMC and using it to estimate R₀ from a range of real outbreaks. In so doing, we utilise a truncation rule to ensure the initial CTMC dynamics are computationally feasible. Thesis (Ph.D.) -- University of Adelaide, School of Mathematical Sciences, 201
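
    The SIR CTMC described in this abstract can be illustrated with a short simulation. The following is a minimal sketch, not the thesis's hybrid method: it assumes the common parameterisation with infection rate `beta * S * I / N` and recovery rate `gamma * I` (the names `beta`, `gamma` and `simulate_sir_ctmc` are hypothetical), and it returns the two quantities the abstract focuses on, the duration and the final size of one outbreak.

```python
import random


def simulate_sir_ctmc(n, i0, beta, gamma, rng):
    """Simulate one path of an SIR CTMC until no infectious individuals remain.

    Assumed rates (not taken from the thesis): infection beta*S*I/N, recovery gamma*I.
    Returns (duration, final_size), where final_size counts everyone ever
    infected, including the i0 initial cases.
    """
    s, i = n - i0, i0
    t = 0.0
    while i > 0:
        infection_rate = beta * s * i / n
        recovery_rate = gamma * i
        total_rate = infection_rate + recovery_rate
        # Time to the next event is exponential with the total event rate.
        t += rng.expovariate(total_rate)
        # Choose which event occurs in proportion to its rate.
        if rng.random() < infection_rate / total_rate:
            s, i = s - 1, i + 1
        else:
            i -= 1
    return t, n - s


if __name__ == "__main__":
    rng = random.Random(2024)
    runs = [simulate_sir_ctmc(200, 1, 2.0, 1.0, rng) for _ in range(1000)]
    small = sum(1 for _, size in runs if size <= 10)
    print(f"fraction of runs ending with at most 10 cases: {small / len(runs):.2f}")
```

    Repeating the simulation many times gives empirical distributions of duration and final size; with a single initial case a sizeable share of runs typically fade out early, which is the behaviour that large-population approximations alone do not capture.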

    Progressive insular cooperative genetic programming algorithm for multiclass classification

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. In contrast to other types of optimisation algorithms, Genetic Programming (GP) simultaneously optimises a group of solutions for a given problem. This group is called the population, the algorithm's iterations are called generations and the optimisation is called evolution, in reference to the algorithm’s inspiration in Darwin’s theory of the evolution of species. When a GP algorithm uses a one-vs-all class comparison for a multiclass classification (MCC) task, the classifiers for each target class (specialists) are evolved in a subpopulation and the final solution of the GP is a team composed of one specialist classifier for each class. In this scenario, an important question arises: should these subpopulations interact during the evolution process or should they evolve separately? The current thesis presents the Progressively Insular Cooperative (PIC) GP, an MCC GP in which the level of interaction between specialists for different classes changes through the evolution process. In the first generations, the different specialists can interact more, but as the algorithm evolves, this level of interaction decreases. At a later point in the evolution process, controlled through algorithm parameterisation, these interactions can be eliminated. Thus, at the beginning of the algorithm there is more cooperation among specialists of different classes, favouring search space exploration. With the elimination of cooperation, search space exploitation is favoured. In this work, different parameters of the proposed algorithm were tested using the Iris dataset from the UCI Machine Learning Repository. The results showed that cooperation among specialists of different classes helps the improvement of classifiers specialised in classes that are more difficult to discriminate.
Moreover, the independent evolution of specialist subpopulations further benefits the classifiers when they have already achieved good performance. A combination of the two approaches seems to be beneficial when starting with subpopulations of differently performing classifiers. The PIC GP also performed well on the more complex Thyroid and Yeast datasets of the same repository, achieving accuracy similar to the best values found in the literature for other MCC models.
Abstract (translated from Portuguese): Unlike other computational optimisation algorithms, the Genetic Programming (GP) algorithm simultaneously optimises a group of solutions to a given problem. This group of solutions is called the population, the iterations of the algorithm are called generations, and the optimisation is called evolution, in allusion to the algorithm's inspiration in Darwin's theory of the evolution of species. When the GP algorithm uses the one-vs-all class comparison approach for multiclass classification (MCC), the classifiers specific to each class (specialists) are evolved in subpopulations, and the final solution of the GP is a team composed of one specialist from each class. In this scenario an important question arises: should these subpopulations interact during the evolutionary process, or should they evolve separately? This thesis presents the Progressively Insular Cooperative (PIC) GP algorithm, an MCC GP in which the degree of interaction between specialists for different classes varies throughout the evolutionary process. In the initial generations, specialists for different classes interact more. As the algorithm evolves, these interactions decrease and later, depending on the parameterisation of the algorithm, they can be eliminated. Thus, at the beginning of the evolutionary process there is more cooperation among specialists of different classes, which favours a broader exploration of the search space. With the elimination of cooperation, a more local and detailed exploration of this space is favoured.
Different parameters of the PIC GP were tested using the Iris dataset from the UCI Machine Learning Repository. The results showed that cooperation among specialists of different classes helped improve the classifiers of classes that are more difficult to model. In addition, evolution without interaction between the classes of different specialities benefited the classifiers when they already showed good performance. A combination of these two modes can be beneficial when the algorithm starts with classifiers of differing quality. The PIC GP also presented very good results for two other, more complex datasets from the same repository, Thyroid and Yeast, achieving accuracy similar to the best values found in the literature for other MCC models.
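
    The one-vs-all team structure described in this abstract (one specialist per class, combined into a single classifier) can be sketched as follows. This is only an illustration under assumptions: the specialists here are hand-written scoring rules on a single petal-length feature rather than evolved GP programs, the thresholds are made up for the example, and taking the highest-scoring specialist is one common way to combine one-vs-all outputs that the abstract does not itself specify.

```python
from typing import Callable, Dict, Sequence

# A specialist maps a feature vector to a score; higher means "more likely my class".
Specialist = Callable[[Sequence[float]], float]


def team_predict(team: Dict[str, Specialist], x: Sequence[float]) -> str:
    """Return the label of the specialist that gives x the highest score."""
    return max(team, key=lambda label: team[label](x))


# Hypothetical stand-ins for evolved specialists. Feature index 2 is taken to be
# petal length in cm; the constants are illustrative, not fitted values.
toy_team: Dict[str, Specialist] = {
    "setosa": lambda x: 2.5 - x[2],
    "versicolor": lambda x: 1.0 - abs(x[2] - 4.2),
    "virginica": lambda x: x[2] - 4.9,
}

if __name__ == "__main__":
    sample = [5.1, 3.5, 1.4, 0.2]
    print(team_predict(toy_team, sample))
```

    In a GP setting each entry of the team would be the best individual of the corresponding subpopulation, so the question raised in the abstract is how much those subpopulations should exchange material while the team members are being evolved.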

    Caprine arthritis encephalitis virus disease modelling review

    Mathematical modelling is used in disease studies to assess the economic impacts of diseases, as well as to better understand the epidemiological dynamics of the biological and environmental factors that are associated with disease spread. For an incurable disease such as Caprine Arthritis Encephalitis (CAE), this knowledge is extremely valuable. However, the application of modelling techniques to CAE disease studies has not been significantly explored in the literature. The purpose of the present work was to review the published studies, highlighting their scope, strengths and limitations, as well as to provide ideas for future modelling approaches for studying CAE disease. The reviewed studies were divided into two major themes: mathematical epidemiological modelling and statistical modelling. Regarding the epidemiological modelling studies, two groups of models have been addressed in the literature: those with and those without a sexual transmission component. Regarding the statistical modelling studies, the reviewed articles varied in their modelling assumptions and goals. These studies modelled dairy production, CAE risk factors and the hypothesis that CAE is a risk factor for other diseases. Finally, the present work concludes with further suggestions for modelling studies on CAE.

    Multi-Algorithm Clustering Analysis for Characterizing Cow Productivity on Automatic Milking Systems Over Lactation Periods

    Rebuli, K. B., Ozella, L., Vanneschi, L., & Giacobini, M. (2023). Multi-Algorithm Clustering Analysis for Characterizing Cow Productivity on Automatic Milking Systems Over Lactation Periods. Computers And Electronics In Agriculture, 211(August 2023), [108002]. https://doi.org/10.2139/ssrn.4435365, https://doi.org/10.1016/j.compag.2023.108002
    This study is supported by Compagnia di San Paolo (ROL 63369 SIME 2020.1713) and by national funds through FCT (Fundação para a Ciência e a Tecnologia), under the project - UIDB/04152/2020 - Centro de Investigação em Gestão de Informação (MagIC)/NOVA IMS.
    This study proposes a novel approach for characterizing the milk productivity patterns of cows milked by Automatic Milking Systems (AMSs) within each lactation period, and for assessing their stability over time. AMSs enable real-time monitoring of udder health and milk quality during each milking episode, leading to an increasing amount of data that can be exploited to optimize herd management. Machine Learning (ML) algorithms are suitable for such situations, as they can handle multi-dimensional, heterogeneous, and large datasets. The methodology presented in this work used four clustering algorithms as unsupervised ML methods to cluster the cows within each lactation period. The clusters were grouped according to their productivity, and a merging index was defined to combine the clustering outcomes into a single, unambiguous result. The stability of the Productivity Groups (PGs) over time was analyzed. The proposed methodology was demonstrated using data from one farm with Holstein Friesian cows that exclusively uses the AMS. The PGs were found to be weakly stable over time, indicating that selecting cows for insemination based solely on their present or past lactation productivity may not be the most effective strategy. The study proposes using the same cows over all lactation periods to better understand the defining factors and dynamics of the PGs.
Overall, the proposed framework provides a valuable tool for characterizing productivity groups and improving herd management practices in dairy farming.

    Novel applications for a noninvasive sampling method of the nasal mucosa

    Reliable methods for sampling the nasal mucosa provide clinical researchers with key information regarding respiratory biomarkers of exposure and disease. For quick and noninvasive sampling of the nasal mucosa, nasal lavage (NL) collection has been widely used as a clinical tool; however, limitations including volume variability, sample dilution, and storage prevent NL collection from being used in nonlaboratory settings and for the analysis of low-abundance biomarkers. In this study, we optimize and validate a novel methodology using absorbent Leukosorb paper cut to fit the nasal passage to extract epithelial lining fluid (ELF) from the nasal mucosa. The ELF sampling method limits the dilution of soluble mediators, allowing quantification of both high- and low-abundance soluble biomarkers such as IL-1β, IL-8, IL-6, interferon gamma-induced protein 10 (IP-10), and neutrophil elastase. Additionally, we demonstrate that this method can successfully detect the presence of respiratory pathogens such as influenza virus and markers of antibiotic-resistant bacteria in the nasal mucosa. The efficacy of ELF collection by this method is not diminished in consecutive-day sampling, and the percent recovery of both recombinant IL-8 and soluble mediators is not changed by freezing or by room-temperature storage for 24 h. Our results indicate that ELF collection using Leukosorb paper provides a sensitive, easy-to-use, and reproducible methodology to collect concentrated amounts of soluble biomarkers from the nasal mucosa. Moreover, the methodology described herein improves upon the standard NL collection method and provides researchers with a novel tool to assess changes in nasal mucosal host defense status.

    Hybrid Markov chain models of S-I-R disease dynamics

    Deterministic epidemic models are attractive due to their compact nature, allowing substantial complexity with computational efficiency. This partly explains their dominance in epidemic modelling. However, the small numbers of infectious individuals at early and late stages of an epidemic, in combination with the stochastic nature of transmission and recovery events, are critically important to understanding disease dynamics. This motivates the use of a stochastic model, with continuous-time Markov chains being a popular choice. Unfortunately, even the simplest Markovian S-I-R model, the so-called general stochastic epidemic, has a state space of order N², where N is the number of individuals in the population, and hence computational limits are quickly reached. Here we introduce a hybrid Markov chain epidemic model, which maintains the stochastic and discrete dynamics of the Markov chain in regions of the state space where they are of most importance, and uses an approximate model (namely a deterministic or a diffusion model) in the remainder of the state space. We discuss the evaluation, efficiency and accuracy of this hybrid model when approximating the distribution of the duration of the epidemic and the distribution of the final size of the epidemic. We demonstrate that the computational complexity is [Formula: see text] and that under suitable conditions our approximations are highly accurate. Nicolas P. Rebuli, N. G. Bean, J. V. Ros
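
    The growth of the state space with population size can be checked directly. The sketch below assumes a closed population of N individuals in which a state is a pair (S, I) with S + I ≤ N (the removed count is then determined); the function names are hypothetical. Counting these pairs gives (N + 1)(N + 2)/2 states, which grows quadratically in N.

```python
def sir_state_count(n: int) -> int:
    """Count the pairs (s, i) of non-negative integers with s + i <= n."""
    return sum(1 for s in range(n + 1) for i in range(n + 1 - s))


def sir_state_count_closed_form(n: int) -> int:
    """Closed form for the same count: (n + 1)(n + 2) / 2."""
    return (n + 1) * (n + 2) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, sir_state_count_closed_form(n))
```

    A transition matrix over all of these states has on the order of N⁴ entries if stored densely, which is one way to see why exact computation becomes costly for large N.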

    Estimating the basic reproductive number during the early stages of an emerging epidemic

    A novel outbreak will generally not be detected until such a time that it has become established. When such an outbreak is detected, public health officials must determine the potential of the outbreak, for which the basic reproductive number R₀ is an important factor. However, it is often the case that the resulting estimate of R₀ is positively biased for a number of reasons. One commonly overlooked reason is that the outbreak was not detected until such a time that it had become established, and therefore did not experience initial fade out. We propose a method which accounts for this bias by conditioning the underlying epidemic model on becoming established and demonstrate that this approach leads to a less-biased estimate of R₀ during the early stages of an outbreak. We also present a computationally efficient approximation scheme which is suitable for large data sets in which the number of notified cases is large. This methodology is applied to an outbreak of pandemic influenza in Western Australia, recorded in 2009. Nicolas P. Rebuli, N.G. Bean, J.V. Ros
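
    To give a sense of how likely initial fade out is, the following uses the standard branching-process approximation for a Markovian SIR-type model in a large population. It is a textbook approximation, not the conditioning method proposed in the paper. The symbols are assumptions introduced here: β is the infection rate, γ the recovery rate, R₀ = β/γ, and I₀ the number of initial infectious individuals.

```latex
\[
  P(\text{fade out}) \approx \left(\frac{1}{R_0}\right)^{I_0},
  \qquad
  P(\text{established}) \approx 1 - \left(\frac{1}{R_0}\right)^{I_0},
  \qquad R_0 > 1 .
\]
```

    For example, with R₀ = 1.5 and a single initial case, the approximation gives a fade-out probability of about 2/3, so an outbreak that is actually observed has already passed a selective filter. Ignoring that filter is what can push estimates of R₀ upwards.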