
    Balancing model complexity and inferential capability for disease modelling

    Mathematical models for the study of infectious diseases have a rich history, but only in recent years has it become possible to fit highly complex models directly to data. This has led to a quantum leap in the capability of mathematical modelling to contribute to an evidence base for policy decisions. A large gap still remains between the most complex models we can simulate and the most complex models we can perform inference on, resulting in a trade-off between model complexity and inferential capability. This thesis tackles three separate problems with this trade-off in mind. First, we study the dynamics of epidemics on degree-heterogeneous clustered networks. Network models have many attractions but also drawbacks: for clustered networks, which represent realistic societal structure, one must generally resort to stochastic simulation, as closed-form approximations of the dynamics do not hold in the highly clustered regime. Furthermore, data for these systems are hard to collect, as they must measure the pairwise interactions of each individual; this lack of data limits the possibilities for applying inference. We develop a new model, not requiring extensive simulations, that approximates these dynamics more accurately than previous approaches, thus addressing the first of these drawbacks. Second, across several chapters we analyse the role of household structure in the transmission and control of soil-transmitted helminths (STH). Starting with a hierarchical negative binomial regression, for which inference can easily be performed but which neglects the non-independence of observations, we move on to develop a general methodology for constructing and performing Bayesian inference on stochastic household models that consider different transmission dynamics within and between the households in a population.
This permits us to estimate the extent to which transmission occurs within, compared to between, households and to simulate the effectiveness of various control strategies, some of which exploit the household structure. We explore how far this general methodology can be extended to arbitrary demographic classes and infection levels before inference with exact-likelihood methods is no longer computationally feasible. Finally, we build a model of the global control programme for lymphatic filariasis at a regional level, forecasting the number of treatments required each year, and their costs, in order to reach elimination. Two scenarios were considered: one in which existing guidelines remained in place, and one in which proposed guidelines incorporating a new treatment were adopted. Our analysis was used by WHO as part of the evidence base for adopting precisely these new guidelines.

    Developing Robust Models, Algorithms, Databases and Tools With Applications to Cybersecurity and Healthcare

    As society and technology become increasingly interconnected, so does the threat landscape. Once-isolated threats now pose serious concerns to highly interdependent systems, highlighting the fundamental need for robust machine learning. This dissertation contributes novel tools, algorithms, databases, and models, through the lens of robust machine learning, in a research effort to solve large-scale societal problems affecting millions of people in the areas of cybersecurity and healthcare. (1) Tools: We develop TIGER, the first comprehensive graph robustness toolbox; and our ROBUSTNESS SURVEY identifies critical yet missing areas of graph robustness research. (2) Algorithms: Our survey and toolbox reveal that existing work has overlooked lateral attacks on computer authentication networks. We develop D2M, the first algorithmic framework to quantify and mitigate network vulnerability to lateral attacks by modeling lateral attack movement from a graph-theoretic perspective. (3) Databases: To prevent lateral attacks altogether, we develop MALNET-GRAPH, the world’s largest cybersecurity graph database (containing over 1.2M graphs across 696 classes), and show the first large-scale results demonstrating the effectiveness of malware detection through a graph medium. We extend MALNET-GRAPH by constructing MALNET-IMAGE, the largest binary-image cybersecurity database (containing 1.2M images, 133× more images than the only other public database), enabling new discoveries in malware detection and classification research previously restricted to a few industry labs. (4) Models: To protect systems from adversarial attacks, we develop UNMASK, the first model that flags semantic incoherence in computer vision systems, which detects up to 96.75% of attacks, and defends the model by correctly classifying up to 93% of attacks.
Inspired by UNMASK’s ability to protect computer vision systems from adversarial attack, we develop REST, which creates noise-robust models through a novel combination of adversarial training, spectral regularization, and sparsity regularization. In the presence of noise, our method improves state-of-the-art sleep stage scoring by 71%, allowing us to diagnose sleep disorders earlier and in the home environment, while using 19× fewer parameters and 15× fewer MFLOPS. Our work has had a significant impact on industry and society: the UNMASK framework laid the foundation for a multi-million dollar DARPA GARD award; the TIGER toolbox for graph robustness analysis is a part of the Nvidia Data Science Teaching Kit, available to educators around the world; we released MALNET, the world’s largest graph classification database with 1.2M graphs; and the D2M framework has had a major impact on Microsoft products, inspiring changes to the product’s approach to lateral attack detection. Ph.D.