271 research outputs found

    Propagation techniques in probabilistic expert systems

    Techniques for the construction of probabilistic expert systems comprising both discrete and continuous random variables are presented. In particular, we are concerned with how continuous random variables may be incorporated into an expert system, an area which has previously received relatively little attention. We investigate and extend the numeric techniques of other authors, and develop two new approaches. The first approach makes use of computer algebra. This exact technique enables a probability distribution to be expressed and manipulated in terms of its algebraic formula, resulting in no loss of information. Our second approach is an approximate method based upon cubic spline interpolation. We constrain the probability density function of a continuous variable to a finite set of points at which we have both function values and first derivatives. These values may then be held in a potential table and treated in an almost identical fashion to discrete variables. While symbolic techniques are shown to be appropriate only in special cases, cubic spline interpolation, though less accurate, is widely applicable. We combine these techniques to form a hybrid methodology in which discrete variables, symbolic continuous variables, and spline-interpolated continuous variables may exist not only in the same junction tree, but also in the same universe. We show how propagation algorithms may be constructed for these various cases and investigate how the means, variances and probability density functions of the marginal distributions in the system may be generated. It is shown how evidence of either a numeric or a symbolic nature may be incorporated into such systems and how simulation studies may be performed. The techniques we develop are implemented in the computer language Mathematica, and an outline of how this may be accomplished is presented.
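The abstract above describes storing a density at a finite set of points, with both function values and first derivatives, and interpolating with cubic splines. A minimal sketch of that idea follows, assuming a piecewise cubic Hermite interpolant and a standard normal density as a stand-in. The function names (`build_table`, `hermite_eval`, `hermite_integral`) and the knot spacing are illustrative assumptions, not the author's Mathematica implementation.

```python
import math


def normal_pdf(x):
    """Standard normal density, used only as a stand-in example."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_pdf_deriv(x):
    """First derivative of the standard normal density."""
    return -x * normal_pdf(x)


def build_table(knots):
    """Hypothetical 'potential table': (knot, value, derivative) triples."""
    return [(x, normal_pdf(x), normal_pdf_deriv(x)) for x in knots]


def hermite_eval(table, x):
    """Evaluate the piecewise cubic Hermite interpolant at x.

    Returns 0.0 outside the tabulated range (an assumption of this sketch).
    """
    for (x0, f0, d0), (x1, f1, d1) in zip(table, table[1:]):
        if x0 <= x <= x1:
            h = x1 - x0
            t = (x - x0) / h
            # Standard cubic Hermite basis functions on [0, 1].
            h00 = 2 * t**3 - 3 * t**2 + 1
            h10 = t**3 - 2 * t**2 + t
            h01 = -2 * t**3 + 3 * t**2
            h11 = t**3 - t**2
            return h00 * f0 + h10 * h * d0 + h01 * f1 + h11 * h * d1
    return 0.0


def hermite_integral(table):
    """Exact integral of the interpolant over the tabulated range."""
    total = 0.0
    for (x0, f0, d0), (x1, f1, d1) in zip(table, table[1:]):
        h = x1 - x0
        # Integrating the four basis functions gives 1/2, 1/12, 1/2, -1/12.
        total += h * (f0 + f1) / 2.0 + h * h * (d0 - d1) / 12.0
    return total


# Knots from -4 to 4 in steps of 0.5 (an arbitrary choice for illustration).
knots = [-4.0 + 0.5 * i for i in range(17)]
table = build_table(knots)

print(hermite_eval(table, 0.25), normal_pdf(0.25))  # interpolant vs. exact
print(hermite_integral(table))                      # should be close to 1
```

Because each piece is fully determined by the values and derivatives at its two end points, the table can be indexed by knot much like a discrete variable's states, which is the property the abstract relies on.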

    The EU Settled Status (Wales) data linkage project: Initial findings relating to health and education

    Objectives: Funded by ADR UK (ESRC) from 2020 to 2023, the project aims to anonymously link European Union Settlement Scheme (EUSS) data with other data already held within the SAIL Databank, based at Swansea University, and produce a research-ready dataset that can be used by researchers to obtain policy-relevant findings. Method: Using a range of de-identified data in the SAIL Databank, a control group of British citizens in Wales has been matched with EU citizens with similar characteristics, using the Census 2011 as a spine to identify country of birth. SQL is used to link Census data with several other datasets in the SAIL Databank relating to health (with a focus on mental health), education and employment. The R software package is used for statistical analysis, to produce comparisons between the groups and to perform significance tests of these comparisons (e.g. Mann-Whitney). Results: Initial findings indicate small but statistically significant differences in school attendance between British pupils in Wales and pupils from EU14 and EU8 countries, and similar differences in school attainment between the same groups. Further analysis has been conducted to explore differences between pupils from EU14 and EU8 countries. Census data are also linked to GP attendance data to explore differences in mental health related consultations and referrals between British citizens in Wales and citizens from EU countries, and between citizens from EU14 and EU8 countries. Detailed findings from this linking of datasets and analysis will be presented. Conclusion: Linking data in this way helps to gain a better understanding of the experiences and outcomes of EU citizens in Wales, generating better evidence to help inform policies and services that address the needs of this population, and offers a dataset of great interest to academics.
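The abstract above mentions Mann-Whitney significance tests for comparing two groups. As a rough illustration of what such a test computes, the sketch below derives the U statistics from average ranks and a two-sided p-value from the normal approximation, without tie or continuity correction. The project itself reports using R; this Python version, the function names, and the attendance figures are made-up assumptions for illustration only and are not project data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b, approximate two-sided p-value).

    The p-value uses the plain normal approximation, which is crude for
    small samples and ignores ties; treat it as indicative only.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    r1 = sum(ranks[:n1])                    # rank sum of the first group
    u1 = r1 - n1 * (n1 + 1) / 2.0
    u2 = n1 * n2 - u1
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, u2, p


# Made-up attendance percentages for two hypothetical pupil groups.
group_a = [95.0, 92.5, 97.0, 90.0, 94.0]
group_b = [91.0, 89.5, 93.0, 88.0, 96.0]
print(mann_whitney_u(group_a, group_b))
```

For real analyses, an established routine such as R's `wilcox.test` handles ties and exact small-sample p-values, which this sketch does not.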

    Clinical coding of long Covid in Wales: A cohort study of 3.5 million people using linked health and demographic data

    Objectives: ‘Long COVID’ (LC) is broadly defined as signs and symptoms that continue or develop after the acute phase of COVID-19, and can affect cardiovascular, respiratory and other organ systems. Using electronic health records, we investigated clinical coding of LC in primary and secondary care for the population of Wales. Methods: We conducted a cohort study for the population of Wales, using anonymised individual-level linked data in the Secure Anonymised Information Linkage (SAIL) Databank. We used the Welsh COVID-19 e-cohort (doi:10.1136/bmjopen-2020-043010), which consists of all people (adults and children) alive and resident in Wales from 1st January 2020. To this e-cohort we linked primary and secondary care, COVID-19 testing, and ethnic group data. We then calculated the proportion of people with a LC diagnosis code (in primary and secondary care data) overall and stratified by demographic variables. Results: Of 3.5m residents, 7,696 (0.2%) had a LC clinical diagnosis. Compared with the general population, a higher proportion of people with LC were female, middle-aged, white, and hospitalised within 28 days of a confirmed COVID-19 infection. LC affected all socioeconomic groups, as assessed using the Welsh Index of Multiple Deprivation. When looking at LC diagnosis codes in primary care, 30.9% of practices in SAIL had not used these codes at all, and the number of recorded events was low until the end of January 2021, after which there was an increase in coding. These findings are likely a substantial underestimate of LC prevalence in Wales. Earlier estimates from self-reported surveys, such as those from the Office for National Statistics, are much higher, ranging from 3% to 5%. Conclusion: Low recording rates of LC and variation between practices could be due to a delay in introducing clinical coding and a lack of presentation/recording. Understanding the prevalence of LC is vital for addressing the scale of the problem. Therefore, developing additional data-driven approaches is necessary to obtain an accurate prevalence estimate.

    The use of sewage treatment works as foraging sites by insectivorous bats

    Sewage treatment works with percolating filter beds are known to provide profitable foraging areas for insectivorous birds due to their association with high macroinvertebrate densities. Fly larvae developing on filter beds at sewage treatment works may similarly provide a valuable resource for foraging bats. Over the last two decades, however, there has been a move away from filter beds towards a system of “activated sludge”. Insects and bat activity were surveyed at 30 sites in Scotland employing these two different types of sewage treatment in order to assess the possible implications of these changes for foraging bats. Bat activity (number of passes) recorded from broad-band bat detectors was quantified at three points within each site. The biomass of aerial insects, sampled over the same period as the detector surveys, was measured using a suction trap. The biomass of insects and the activity of Pipistrellus spp. were significantly higher at filter beds than at activated sludge sites. In addition, whilst foraging activity of Pipistrellus spp. at filter beds was comparable to that of adjacent “good” foraging habitat, foraging at activated sludge sites was considerably lower. This study indicates the high potential value of an anthropogenic process to foraging bats, particularly in a landscape where their insect prey has undergone a marked decline, and suggests that the current preference for activated sludge systems is likely to reduce the value of treatment works as foraging sites for bats.

    Model Structure Identification from Experimental Data

    Methods for identifying the structure of dynamic mathematical models for water quality by reference to experimental field data are discussed. The context of the problem of model structure identification is described by briefly reviewing the steps involved in the overall process of system identification. These steps include experimental design; choice of model type; model structure identification; parameter estimation; and verification/validation. Two examples of approaches to solving the problem of model structure identification are presented. The first example is concerned with identifying the structure of a black box (input/output) model for the variations of gas production in the anaerobic digestion process of wastewater treatment. The second example addresses the more difficult problem of identifying the structure of an internally descriptive ("mechanistic") model form.