90 research outputs found

    Assessing the quality of data on international migration flows in Europe: the case of undercounting

    Undercounting is a critical issue in migration statistics, resulting in bias. It typically arises from insufficient reporting requirements and problems with enforcing such requirements. The main sources of information on undercounting are the metadata accompanying official statistics and expert opinions. However, metadata and arbitrary expert opinions may overlook important details in the migration data shared by various countries, such as changes in methodologies or definitions, or retrospective updates to the data following censuses. This work presents a methodological solution with three objectives to address undercounting in international migration data. First, we provide an overview of available metadata and expert opinions on undercounting in European migration flows. Second, we propose a novel data-driven approach that incorporates year-specific and duration-of-stay-adjusted classifications. The proposed methodological solution relies on comparisons of flows in the same direction reported by a given country with high-quality data reported by another set of countries. We use bilateral migration data provided by Eurostat, the UN and selected national statistical institutes. Duration-of-stay correction coefficients are derived through an optimization model or borrowed from the literature. Metadata and expert opinion scores can also be integrated to classify undercounting. Finally, we provide a dynamic classification of undercounting for 32 European countries (2002-2019), accessible through an online Shiny application, offering flexibility and adaptability. The findings highlight significant undercounting in new EU member states, particularly Bulgaria, Latvia, and Romania. Interestingly, other European countries, including those presumed to maintain reliable population statistics, also exhibit notable periods of undercounting.
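    The comparison of same-direction flows described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the function names, the class thresholds and the flow figures are all hypothetical, and the duration-of-stay correction is left out.

```python
def undercount_ratio(own_reports, mirror_reports):
    """Ratio of emigration reported by the sending country to the same flows
    as recorded by receiving countries assumed to have high-quality data.

    own_reports:    {destination: emigrants reported by the sending country}
    mirror_reports: {destination: immigrants from that country reported by
                     the destination}
    Only destinations present in both dictionaries are compared.
    """
    total_own = 0.0
    total_mirror = 0.0
    for destination, mirror_flow in mirror_reports.items():
        if destination in own_reports:
            total_own += own_reports[destination]
            total_mirror += mirror_flow
    if total_mirror == 0:
        raise ValueError("no comparable mirror flows")
    return total_own / total_mirror


def classify_undercounting(ratio):
    """Map a ratio to a class. The cut-offs are illustrative assumptions."""
    thresholds = ((0.3, "very high"), (0.6, "high"), (0.9, "moderate"))
    for upper_bound, label in thresholds:
        if ratio < upper_bound:
            return label
    return "low"


# Made-up flows for one sending country in one year.
own = {"DE": 1000, "SE": 200}
mirror = {"DE": 4000, "SE": 800}
ratio = undercount_ratio(own, mirror)
label = classify_undercounting(ratio)
```

    A ratio well below 1 suggests that the sending country records far fewer departures than its partners record arrivals. Repeating the calculation per year would give a year-specific classification of the kind the abstract mentions.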

    Integrating Traditional and Social Media Data to Predict Bilateral Migrant Stocks in the European Union

    Although up-to-date information on the nature and extent of migration within the European Union (EU) is important for policymaking, timely and reliable statistics on the number of EU citizens residing in or moving across other member states are difficult to obtain. In this paper, we develop a statistical model that integrates data on EU migrant stocks from traditional sources, such as censuses, population registers and the Labour Force Survey, with novel data sources, primarily the Facebook Advertising Platform. Findings suggest that combining different data sources provides near real-time estimates that can serve as early warnings about shifts in EU mobility patterns. Estimated migrant stocks match the observed data relatively well, despite some overestimation of smaller migrant populations and underestimation of larger migrant populations in Germany and the United Kingdom. In addition, the model estimates missing stocks for migrant corridors and years where no data are available, offering timely now-casted estimates.

    Hospital length of stay for COVID-19 patients: Data-driven methods for forward planning.

    From Europe PMC via Jisc Publications Router. History: ppub 2021-07-01, epub 2021-07-22. Publication status: Published. Funder: Medical Research Council; Grant(s): MR/R502236/1. Funder: Royal Society; Grant(s): 202562/Z/16/Z, INF/R2/180067.
    Background: Predicting hospital length of stay (LoS) for patients with COVID-19 infection is essential to ensure that adequate bed capacity can be provided without unnecessarily restricting care for patients with other conditions. Here, we demonstrate the utility of three complementary methods for predicting LoS using UK national- and hospital-level data.
    Method: On a national scale, relevant patients were identified from the COVID-19 Hospitalisation in England Surveillance System (CHESS) reports. An Accelerated Failure Time (AFT) survival model and a truncation corrected method (TC), both with underlying Weibull distributions, were fitted to the data to estimate LoS from hospital admission date to an outcome (death or discharge) and from hospital admission date to Intensive Care Unit (ICU) admission date. In a second approach we fit a multi-state (MS) survival model to data directly from the Manchester University NHS Foundation Trust (MFT). We develop a planning tool that uses LoS estimates from these models to predict bed occupancy.
    Results: All methods produced similar estimates of LoS for overall hospital stay, given a patient is not admitted to ICU (8.4, 9.1 and 8.0 days for AFT, TC and MS, respectively). Estimates differ more between the local and national level when considering ICU. National estimates for ICU LoS from AFT and TC were 12.4 and 13.4 days, whereas in local data the MS method produced estimates of 18.9 days.
    Conclusions: Given the complexity and partiality of different data sources and the rapidly evolving nature of the COVID-19 pandemic, it is most appropriate to use multiple analysis methods on multiple datasets. The AFT method accounts for censored cases, but does not allow for simultaneous consideration of different outcomes. The TC method does not include censored cases, instead correcting for truncation in the data, but does consider these different outcomes. The MS method can model complex pathways to different outcomes whilst accounting for censoring, but cannot handle non-random case missingness. Overall, we conclude that data-driven modelling approaches to LoS using these methods are useful in epidemic planning and management, and should be considered for widespread adoption throughout healthcare systems internationally where similar data resources exist.
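    To illustrate how a Weibull length-of-stay distribution could feed a bed-occupancy projection, here is a minimal sketch. It is not the planning tool from the paper: the shape and scale values and the admission counts are hypothetical placeholders, and censoring, truncation and ICU pathways are ignored.

```python
import math


def weibull_survival(t, shape, scale):
    """P(LoS > t) under a Weibull distribution: exp(-(t / scale) ** shape)."""
    if t <= 0:
        return 1.0
    return math.exp(-((t / scale) ** shape))


def weibull_mean(shape, scale):
    """Mean length of stay: scale * Gamma(1 + 1 / shape)."""
    return scale * math.gamma(1.0 + 1.0 / shape)


def expected_occupancy(admissions, shape, scale):
    """Expected beds occupied on each day.

    admissions[d] is the number of patients admitted on day d. A patient
    admitted on day a still occupies a bed on day d with probability
    weibull_survival(d - a, shape, scale).
    """
    occupancy = []
    for day in range(len(admissions)):
        beds = 0.0
        for admission_day in range(day + 1):
            still_in = weibull_survival(day - admission_day, shape, scale)
            beds += admissions[admission_day] * still_in
        occupancy.append(beds)
    return occupancy


# Hypothetical parameters (not fitted values from the paper).
mean_los = weibull_mean(shape=1.0, scale=8.0)
beds = expected_occupancy([10, 0, 0], shape=1.0, scale=8.0)
```

    With shape 1 the Weibull reduces to an exponential distribution, so the mean equals the scale parameter. Fitted shape and scale values from an AFT or truncation-corrected model would replace the placeholders.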

    Assessing time series models for forecasting international migration : lessons from the United Kingdom

    Funding: This work was funded by the Migration Advisory Committee (MAC), UK Home Office, under the Home Office Science contract HOS/14/040, and also supported by the ESRC Centre for Population Change grant ES/K007394/1.
    Migration is one of the most unpredictable demographic processes. The aim of this article is to provide a blueprint for assessing various possible forecasting approaches in order to help safeguard producers and users of official migration statistics against misguided forecasts. To achieve that, we first evaluate the various existing approaches to modelling and forecasting of international migration flows. Subsequently, we present an empirical comparison of ex post performance of various forecasting methods, applied to international migration to and from the United Kingdom. The overarching goal is to assess the uncertainty of forecasts produced by using different forecasting methods, both in terms of their errors (biases) and calibration of uncertainty. The empirical assessment, comparing the results of various forecasting models against past migration estimates, confirms the intuition about weak predictability of migration, but also highlights varying levels of forecast errors for different migration streams. There is no single forecasting approach that would be well suited for different flows. We therefore recommend adopting a tailored approach to forecasts, and applying a risk management framework to their results, taking into account the levels of uncertainty of the individual flows, as well as the differences in their potential societal impact.
    Publisher PDF. Peer reviewed.

    Model confidence sets and forecast combination: an application to age-specific mortality

    Background: Model averaging combines forecasts obtained from a range of models, and it often produces more accurate forecasts than a forecast from a single model. Objective: Improving forecast accuracy through model averaging depends crucially on determining the optimal weights from a finite sample. If the weights are selected sub-optimally, this can affect the accuracy of the model-averaged forecasts. Instead of choosing the optimal weights, we consider trimming a set of models before equally averaging forecasts from the selected superior models. Motivated by Hansen et al. (2011), we apply and evaluate the model confidence set procedure when combining mortality forecasts. Data & Methods: The proposed model averaging procedure is motivated by Samuels and Sekkel (2017) and is based on the concept of model confidence sets proposed by Hansen et al. (2011), which incorporates the statistical significance of the forecasting performance. As the model confidence level increases, the set of superior models generally decreases. The proposed model averaging procedure is demonstrated via national and sub-national Japanese mortality for retirement ages between 60 and 100+. Results: Illustrated by national and sub-national Japanese mortality for ages between 60 and 100+, the proposed model-average procedure gives the smallest interval forecast errors, especially for males. Conclusion: We find that robust out-of-sample point and interval forecasts may be obtained from the trimming method. By robust, we mean robustness against model misspecification.
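    The trim-then-average idea can be sketched as below. Note the substitution: this is not the model confidence set procedure of Hansen et al. (2011), which relies on bootstrap-based significance tests. A naive loss threshold stands in for that step purely for illustration, and the function names, the tolerance value and the forecasts are all hypothetical.

```python
def mean_squared_error(forecast, actual):
    """Average squared forecast error over a validation period."""
    return sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual)


def trim_models(validation_forecasts, validation_actual, tolerance=1.5):
    """Keep models whose validation MSE is within `tolerance` times the best.

    A crude stand-in for a model confidence set: no significance testing.
    """
    losses = {
        name: mean_squared_error(forecast, validation_actual)
        for name, forecast in validation_forecasts.items()
    }
    best_loss = min(losses.values())
    return sorted(name for name, loss in losses.items()
                  if loss <= tolerance * best_loss)


def equal_weight_average(forecasts, selected):
    """Average the forecasts of the selected models with equal weights."""
    horizon = len(forecasts[selected[0]])
    return [
        sum(forecasts[name][h] for name in selected) / len(selected)
        for h in range(horizon)
    ]


# Made-up validation forecasts from three hypothetical models.
validation = {"a": [1.1, 2.1], "b": [0.9, 1.9], "c": [2.0, 3.0]}
selected = trim_models(validation, [1.0, 2.0])
future = {"a": [2.0, 4.0], "b": [4.0, 6.0], "c": [9.0, 9.0]}
combined = equal_weight_average(future, selected)
```

    Equal weights avoid estimating a weight vector from a short sample, so the selection step carries the whole burden of excluding poor models.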

    Inferring transient dynamics of human populations from matrix non-normality

    This is the final version of the article. Available from Springer Verlag via the DOI in this record.
    In our increasingly unstable and unpredictable world, population dynamics rarely settle uniformly to long-term behaviour. However, projecting period-by-period through the preceding fluctuations is more data-intensive and analytically involved than evaluating at equilibrium. To efficiently model populations and best inform policy, we require pragmatic suggestions as to when it is necessary to incorporate short-term transient dynamics and their effect on eventual projected population size. To estimate this need for matrix population modelling, we adopt a linear algebraic quantity known as non-normality. Matrix non-normality is distinct from normality in the Gaussian sense, and indicates the amplificatory potential of the population projection matrix given a particular population vector. In this paper, we compare and contrast three well-regarded metrics of non-normality, which were calculated for over 1000 age-structured human population projection matrices from 42 European countries in the period 1960 to 2014. Non-normality increased over time, mirroring the indices of transient dynamics that peaked around the millennium. By standardising the matrices to focus on transient dynamics and not changes in the asymptotic growth rate, we show that the damping ratio is an uninformative predictor of whether a population is prone to transient booms or busts in its size. These analyses suggest that population ecology approaches to inferring transient dynamics have too often relied on suboptimal analytical tools focussed on an initial population vector rather than the capacity of the life cycle to amplify or dampen transient fluctuations. Finally, we introduce the engineering technique of pseudospectra analysis to population ecology, which, like matrix non-normality, provides a more complete description of the transient fluctuations than the damping ratio. Pseudospectra analysis could further support non-normality assessment to enable a greater understanding of when we might expect transient phases to impact eventual population dynamics.
    This work was funded by Wellcome Trust New Investigator 103780 to TE, who is also funded by NERC Fellowship NE/J018163/1. JB gratefully acknowledges the ESRC Centre for Population Change ES/K007394/1.
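    A real matrix A is normal exactly when it commutes with its transpose, so the size of the commutator A Aᵀ − Aᵀ A gives one simple indication of non-normality. The sketch below computes a scaled version of it for a small age-structured projection matrix. The abstract does not name its three metrics, so this is not necessarily one of them, and the matrix entries are made up.

```python
import math


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    b_columns = transpose(b)
    return [[sum(x * y for x, y in zip(row, column)) for column in b_columns]
            for row in a]


def frobenius_norm(matrix):
    return math.sqrt(sum(value ** 2 for row in matrix for value in row))


def commutator_nonnormality(matrix):
    """||A A^T - A^T A||_F / ||A||_F^2, which is zero iff real A is normal.

    Dividing by ||A||_F^2 makes the measure invariant to rescaling A.
    """
    a_t = transpose(matrix)
    left = matmul(matrix, a_t)
    right = matmul(a_t, matrix)
    difference = [[l - r for l, r in zip(left_row, right_row)]
                  for left_row, right_row in zip(left, right)]
    return frobenius_norm(difference) / frobenius_norm(matrix) ** 2


# Hypothetical 3-age-class Leslie matrix: fertilities in the first row,
# survival probabilities on the subdiagonal.
leslie = [
    [0.0, 1.2, 0.8],
    [0.9, 0.0, 0.0],
    [0.0, 0.7, 0.0],
]
symmetric = [[2.0, 1.0], [1.0, 3.0]]

leslie_score = commutator_nonnormality(leslie)
symmetric_score = commutator_nonnormality(symmetric)
```

    A symmetric matrix scores zero, whereas a Leslie matrix, with its fertility row and survival subdiagonal, is non-normal and scores above zero.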

    Data for: Hierarchical model for forecasting the outcomes of binary referenda

    No full text
    The archive file contains data from opinion polls on the Scottish independence referendum (2014) and the EU membership referendum (2016). It also contains the R code, JAGS model and OpenBUGS code used to estimate the model parameters and to forecast the referendum outcomes.