66 research outputs found

    The selection of covariates for the relationship between blood-lead and ability

    This thesis arose from a problem in the analysis of data from the Edinburgh Lead Study. The data were to be used to estimate the influence of children's blood lead levels on their mental abilities, controlling for other factors which might confound this relationship. The other factors were summarised as a set of covariate scores, and the question arose as to which of these scores should be included in a multiple regression whose purpose was to estimate the coefficient of blood-lead. This problem has arisen in other studies of the influence of lead on ability, and a variety of solutions have been implemented. The statistical and epidemiological literature offers little guidance. The problem is formalised by proposing regression models with various assumptions. Expressions are derived for the mean-square-error of the parameter of special interest (here the blood-lead coefficient) in terms of quantities which can be calculated from the data. Various stepwise procedures are proposed for selecting a sub-set of covariates to include in the regression equation. These include the usual stepwise procedures, as well as new ones based on the various mean-square-error criteria and on changes in the coefficient of interest. These procedures are studied for the data from the Edinburgh Lead Study and evaluated by simulation in different ways. The potential for variance reduction from sub-models, compared to including all covariates, is a function of the multiple correlation between the variable of special interest and the variables which could be omitted from the model. The results suggest that, unless this correlation exceeds 0.2, inferences should be based on a regression with the full set of covariates. The greatest benefit is obtained from sub-set selection procedures when the multiple correlation is increased as a result of a decrease in the residual degrees of freedom. In these circumstances the multiple correlation will be high, but its value will fall when the usual adjustment for degrees of freedom is applied. The simulation results suggest that sub-set selection will be beneficial when the residual degrees of freedom for the full model are less than three times the number of covariates. The method which performed best was to select, at each step, the variable which made the largest change in the coefficient of interest. Stopping rules for this criterion are proposed. This method was less prone than the other methods to underestimate the variance of the coefficient of interest, when this is evaluated in the usual way for the final model. However, it performed badly, and underestimated this variance, for artificial data in which the population multiple correlation between the variable of special interest and the covariates was high. This suggests that sub-set selection should not be used when the estimated multiple correlation adjusted for degrees of freedom is high. These criteria applied to the Lead Study data would suggest that the effect of lead on ability should be assessed by adjusting for all the covariate scores.
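The change-in-coefficient selection described in the abstract above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the thesis's own implementation: the function names are hypothetical, the stopping rules proposed in the thesis are not given in the abstract and so are not implemented (the full ordering is returned instead), and the degrees-of-freedom adjustment is assumed to be the standard adjusted R-squared formula.

```python
import numpy as np


def fit_exposure_coef(y, exposure, covs, cols):
    """Coefficient of `exposure` in the OLS fit y ~ 1 + exposure + covs[:, cols].

    Hypothetical helper: `cols` is a list of covariate column indices to include.
    """
    design = np.column_stack(
        [np.ones(len(y)), exposure] + [covs[:, j] for j in cols]
    )
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]  # column 0 is the intercept, column 1 is the exposure


def adjusted_r2(exposure, covs):
    """Adjusted R-squared from regressing the exposure on all covariates.

    Assumption: the 'usual adjustment for degrees of freedom' is taken to be
    1 - (1 - R^2) * (n - 1) / (n - p - 1).
    """
    n, p = covs.shape
    design = np.column_stack([np.ones(n), covs])
    beta, *_ = np.linalg.lstsq(design, exposure, rcond=None)
    resid = exposure - design @ beta
    r2 = 1.0 - resid.var() / exposure.var()
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def change_in_coefficient_order(y, exposure, covs):
    """Forward selection: at each step add the covariate whose inclusion
    changes the exposure coefficient the most.

    Returns the order in which covariates were added and the exposure
    coefficient after each step (starting with no covariates). No stopping
    rule is applied; that choice is left to the caller.
    """
    chosen, remaining = [], list(range(covs.shape[1]))
    current = fit_exposure_coef(y, exposure, covs, chosen)
    path = [current]
    while remaining:
        # Refit once per candidate covariate and record the new coefficient.
        candidates = {
            j: fit_exposure_coef(y, exposure, covs, chosen + [j])
            for j in remaining
        }
        best = max(remaining, key=lambda j: abs(candidates[j] - current))
        chosen.append(best)
        remaining.remove(best)
        current = candidates[best]
        path.append(current)
    return chosen, path
```

Calling `change_in_coefficient_order(y, exposure, covs)` with a response vector, an exposure vector and an `(n, p)` covariate matrix returns the covariate ordering and the path of exposure coefficients; `adjusted_r2(exposure, covs)` gives the adjusted measure of how strongly the exposure is related to the covariates.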

    Methods to control disclosure risk of synthetic data created by National Statistics Agencies

    Objectives With the recent explosion of interest in using synthetic data (SD) for disclosure control, many NSAs are releasing, or considering releasing, synthetic versions of their administrative data. This presentation will review the methods that NSAs can use to limit the disclosure risk of any planned release of synthetic data. Methods This paper will review the ways in which methods of creating synthetic data can be adapted to control the disclosure risk that could arise from the release of such data, either to trusted researchers or to a wider group. Methods that will be evaluated include: (1) the use of Statistical Disclosure Control (SDC) methods on the synthetic data before its release; (2) selecting methods that produce low-fidelity synthetic data; (3) adapting the synthesis method until it satisfies measures of disclosure risk; (4) incorporating differential privacy (DP) into the method of creating synthetic data. Results NSAs can use different methods to create SD based on real data (RD); see e.g. https://unece.org/info/publications/pub/373531. The disclosure risk of SD depends on the context of its release: to whom, in what environment, etc. Even if the planned method of release ensures low disclosure risk, NSAs will want to know what the disclosure risk might be if the SD got into the wrong hands. The SD can reveal that an identified person is in the RD (identity disclosure) or can disclose information about other measures for an individual that are part of the RD (attribute disclosure). Measures of identity disclosure and attribute disclosure are described. Results will be presented on the disclosure risk of examples of SD created for real examples by methods 1 to 4. Conclusion Each of methods 1 to 4 has strengths and weaknesses. Methods 2 and 4 will be ruled out for many applications because of poor fidelity to the RD. A practical way forward is suggested by combining methods 1 and 3.

    Creating a longitudinal dataset of care experienced children in Scotland – Administrative Data Research Scotland.

    To create a dataset which describes the care experience of children in Scotland. To share it with analysts in a safe setting to allow linkage, and to provide information on its strengths and weaknesses. To gather feedback on its use to inform future collections and improve the experience of future users. The dataset was created by combining 11 years of data provided to Scottish Government by local authorities. The resulting longitudinal dataset includes details of the children, the periods of care including the type of care setting, along with the legal basis for this care, and information on the destination of the child following care. Detailed information about the dataset is provided along with a background document on how the dataset was constructed. Data quality flags are provided to highlight situations where there may be inconsistencies, and code is shared to create a cleaned dataset for analysis. A longitudinal dataset has been created covering the period 2009 to 2019. This covers almost 60,000 children with details of the care setting and legal basis. The dataset has been indexed to a population spine, which enables it to be linked to other data. Additional derived variables have been added, and improvements made to data quality where possible. In other situations data quality is highlighted with guidance provided on how to deal with these issues. The results have been shared extensively with the data providers and previous users, who have provided valuable input. This has resulted in improved understanding of the data, and has informed the practice of gathering data in future years and plans for official statistics on health outcomes of care experienced children. This project has created an enduring resource which will improve evidence for policy making by allowing analysis to consider patterns of care experience and link it to outcomes. This is particularly relevant given a review of care experience in Scotland, and the commitment by Scottish Government to keep the Promise.

    Neighbourhood ethnic mix and the formation of mixed-ethnic unions in Britain : a longitudinal analysis

    This research is funded by the ESRC under the Understanding Population Trends and Processes (UPTAP) programme (Award Ref: RES-163-25-0045). Although developed societies are becoming increasingly ethnically diverse, relatively little research has been conducted on geographies of mixed-ethnic unions (married or cohabiting). There is some recent evidence from the US that mixed-ethnic couples are more likely to be found in mixed-ethnic neighbourhoods, but this research is based on cross-sectional data. Therefore it is not possible to determine whether mixed-ethnic couples are more likely to form in mixed-ethnic neighbourhoods or whether they are more likely to move there. Our longitudinal analysis allows us to tease out the relative importance of these two processes, furthering our understanding of the formation of mixed-ethnic unions. Using data from the Office for National Statistics Longitudinal Study we examine neighbourhood effects on the formation of mixed-ethnic unions in England and Wales. We find that mixed-ethnic unions are more likely to form in neighbourhoods with low concentrations of co-ethnic population. The results from this study lend support to the contact theory that geographical proximity to other ethnic groups enhances mutual understanding between people from different ethnic groups and could lead to the development of intimate partnerships.

    Infants born into care in Scotland: Initial Findings

    This report is one of the first outputs that uses linked data from the Looked after Children in Scotland data (LAC-S) to examine looked after children’s journeys. These data were made available to the research team in a secure environment that protects the privacy of all subjects. The report describes the patterns of care for infants who first became looked after in Scotland when under 1 year of age between 1st April 2008 and 31st July 2017. It includes details of all episodes of care up to the end of follow-up (31st July 2017)

    Outcomes for children and young people growing up in kinship care in Scotland: A population-level data linkage study

    Objectives The number of children and young people living in kinship care (living separately from their parents with a family friend/relative) in Scotland has increased consistently since 2010. This study aims to explore outcomes for children and young people in kinship care in terms of their health, education, and care journeys. Methods This population-wide data linkage study will utilise data on all children and young people who have been placed in formal kinship care from 2008 onwards. The datasets linked for this study are: • Looked After Children • Child Protection • Health Visiting • Educational attainment, attendances, absences and exclusion • Children’s Hearings data Through descriptive analysis the study will provide insight into the experiences and outcomes of children and young people in kinship care. Where appropriate, comparisons will be drawn with the wider care population and the general population of children and young people in Scotland. Results Data will be presented which highlight: • Levels of kinship care usage in Scotland, and how this varies regionally and over time. • The routes into kinship care in terms of child protection concerns and legal reasons data. • How children fare while in kinship care – specifically in terms of their childhood development, placement stability and educational attainment/engagement. • The pathways out of formal kinship care, and whether these differ over time. Full results will be available for publication at the 2023 ADR UK conference. Conclusion This is the first population-wide study on outcomes for children and young people living in kinship care within Scotland. It is hoped that insight arising from this work will aid local authorities, government and others in meeting the needs of the steadily increasing number of children living in formal kinship care

    synthpop: Bespoke Creation of Synthetic Data in R

    In many contexts, confidentiality constraints severely restrict access to unique and valuable microdata. Synthetic data which mimic the original observed data and preserve the relationships between variables but do not contain any disclosive records are one possible solution to this problem. The synthpop package for R, introduced in this paper, provides routines to generate synthetic versions of original data sets. We describe the methodology and its consequences for the data characteristics. We illustrate the package features using a survey data example

    The impact of the Covid-19 pandemic restrictions on children entering and leaving care in Scotland

    Objectives The Covid-19 pandemic caused huge upheaval across the world, and the Scottish children’s social care sector was not exempt from this. This study explored the extent to which the Covid-19 pandemic restrictions impacted on the rate at which children and young people were entering and leaving care in Scotland. Methods Analysis was conducted on the Scottish Government’s ‘Longitudinal Looked After Children’ dataset which contains information on the care journeys of all children who were ‘looked after’ in Scotland between April 2008 and July 2021. Through a descriptive analysis, the study determined how the pandemic restrictions impacted the rates of children entering care, the rates of children leaving care, and the stability and duration of care placements at this time. Results During the initial year of the pandemic, there was a marked reduction in the number of children both entering (38%) and leaving (22%) care when compared to the year prior. The reduction in entries to care was less notable for infants under the age of 1 than for older age groups. Fewer children entered care under compulsory measures, with an increased proportion entering through ‘voluntary’ Section 25 measures. The impact on entries to care and exits from care varied substantially across local authorities in Scotland. The stability and duration of children’s placements were also impacted, with children moving placements less and staying in care for longer periods of time. Conclusion The pandemic restrictions had a substantial impact on the interactions of children and young people with the ‘care system’. At a time when there is great focus on improving the experience of young people in care within Scotland, it is important that lessons are learned and implemented for future disruptive events.