90 research outputs found

    A Metadata Manifesto: The Need for Global Health Metadata

    Administrative health data recorded for individual health episodes (such as births, deaths, physician visits, and hospital stays) are being widely used to study policy-relevant scientific questions about population health, health services, and quality of care. Furthermore, an increasing number of international health comparisons are being undertaken with these data. An essential prerequisite to such international comparative work is a detailed characterization of existing international health data resources, so that they can be more readily used in comparison studies across countries. A major challenge to such international comparative work is the variability across countries in the extent, content, and validity of existing administrative data holdings. Recognizing this, we have undertaken an international pilot process of compiling detailed data about data (i.e., a “meta-data catalogue”) for existing international administrative health data holdings. The methodological process for collecting these meta-data is described here, along with some general descriptive results for selected countries included in the pilot.

    Exploration of association rule mining for coding consistency and completeness assessment in inpatient administrative health data.

    OBJECTIVE: Data quality assessment is a challenging facet of research using coded administrative health data. Current assessment approaches are time- and resource-intensive. We explored whether association rule mining (ARM) can be used to develop rules for assessing data quality. MATERIALS AND METHODS: We extracted 2013 and 2014 records from the hospital discharge abstract database (DAD) for patients between the ages of 55 and 65 from five acute care hospitals in Alberta, Canada. The ARM was conducted using the 2013 DAD to extract rules with support ≥0.0019 and confidence ≥0.5 using the bootstrap technique, and tested in the 2014 DAD. The rules were compared against the method of coding frequency and assessed for their ability to detect error introduced by two kinds of data manipulation: random permutation and random deletion. RESULTS: The association rules generally had clear clinical meanings. Comparing 2014 data to 2013 data (both original), there were 3 rules with a confidence difference >0.1, while the coding frequency difference for codes on the right-hand side of the rules was less than 0.004. After random permutation of 50% of codes in the 2014 data, average rule confidence dropped from 0.72 to 0.27 while coding frequency remained unchanged. Rule confidence decreased as code deletion increased, as expected. Rule confidence was more sensitive to code deletion than coding frequency was, with the slope of change ranging from 1.7 to 184.9 with a median of 9.1. CONCLUSION: The ARM is a promising technique to assess data quality. It offers a systematic way to derive coding association rules hidden in data, and potentially provides a sensitive and efficient method of assessing data quality compared to standard methods.
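
    The support and confidence thresholds mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it only mines rules with a single code on each side, it omits the bootstrap step, and the function name `mine_pair_rules` and the toy records are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(records, min_support=0.0019, min_confidence=0.5):
    """Return (lhs, rhs, support, confidence) for single-code rules lhs -> rhs.

    support    = share of records containing both codes
    confidence = share of records with lhs that also contain rhs
    """
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for codes in records:
        unique = sorted(set(codes))  # count each code once per record
        item_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules, one per direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical toy discharge records (lists of diagnosis codes), not study data.
records = [
    ["E11", "I10"],
    ["E11", "I10", "N18"],
    ["I10"],
    ["E11", "I10"],
    ["J44"],
]
rules = mine_pair_rules(records, min_support=0.4, min_confidence=0.7)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

    Comparing the confidence of the same rule in two years of data, as the abstract describes, would then amount to running the function on each year and differencing the confidence values.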

    Data on coding association rules from an inpatient administrative health data coded by International classification of disease - 10th revision (ICD-10) codes

    The data presented in this article relate to the research article entitled “Exploration of association rule mining for coding consistency and completeness assessment in inpatient administrative health data” (Peng et al. [1], in preparation). We provide a set of ICD-10 coding association rules for the age group of 55 to 65. The rules were extracted from inpatient administrative health data at five acute care hospitals in Alberta, Canada, using association rule mining. Thresholds of support and confidence for the association rule mining process were set at 0.19% and 50%, respectively. The data set contains 426 rules, of which 86 are not nested. Data are provided in the supplementary material. The presented coding association rules provide a reference for future research on the use of association rule mining for data quality assessment.

    An administrative data merging solution for dealing with missing data in a clinical registry: adaptation from ICD-9 to ICD-10

    BACKGROUND: We have previously described a method for dealing with missing data in a prospective cardiac registry initiative. The method involves merging registry data to corresponding ICD-9-CM administrative data to fill in missing data 'holes'. Here, we describe the process of translating our data merging solution to ICD-10, and then validating its performance. METHODS: A multi-step translation process was undertaken to produce an ICD-10 algorithm, and merging was then implemented to produce complete datasets for 1995–2001 based on the ICD-9-CM coding algorithm, and for 2002–2005 based on the ICD-10 algorithm. We used cardiac registry data for patients undergoing cardiac catheterization in fiscal years 1995–2005. The corresponding administrative data records were coded in ICD-9-CM for 1995–2001 and in ICD-10 for 2002–2005. The resulting datasets were then evaluated for their ability to predict death at one year. RESULTS: The prevalence of the individual clinical risk factors increased gradually across years. There was, however, no evidence of either an abrupt drop or rise in prevalence of any of the risk factors. The performance of the new data merging model was comparable to that of our previously reported methodology: c-statistic = 0.788 (95% CI 0.775, 0.802) for the ICD-10 model versus c-statistic = 0.784 (95% CI 0.780, 0.790) for the ICD-9-CM model. The two models also exhibited similar goodness-of-fit. CONCLUSION: The ICD-10 implementation of our data merging method performs as well as the previously-validated ICD-9-CM method. Such methodological research is an essential prerequisite for research with administrative data now that most health systems are transitioning to ICD-10.
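
    For readers unfamiliar with the c-statistic reported here, the following Python sketch shows one common way to compute it: as the proportion of (died, survived) patient pairs in which the patient who died had the higher predicted risk, with ties counted as one half. The function name and the toy values are hypothetical, and nothing here reproduces the authors' model.

```python
def c_statistic(predicted_risk, died):
    """Concordance (c) statistic for a binary outcome.

    predicted_risk: predicted probabilities of death, one per patient
    died:           1/0 (or True/False) outcome flags, same order
    """
    events = [p for p, d in zip(predicted_risk, died) if d]
    non_events = [p for p, d in zip(predicted_risk, died) if not d]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for ne in non_events:
            if e > ne:
                concordant += 1.0   # model ranks the pair correctly
            elif e == ne:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted one-year mortality risks and observed outcomes.
risk = [0.9, 0.8, 0.4, 0.35, 0.1]
died = [1, 0, 1, 0, 0]
c = c_statistic(risk, died)
print(f"c-statistic = {c:.3f}")
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking of deaths above survivors.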

    Child health insurance coverage: a survey among temporary and permanent residents in Shanghai

    BACKGROUND: Under the current healthcare system in China, there is no government-sponsored health insurance program for children. Children from families who move from rural and interior regions to large urban centres without a valid residency permit might be at higher risk of being uninsured due to their low socioeconomic status. We conducted a survey in Shanghai to describe children's health insurance coverage according to their migration status. METHOD: Between 2005 and 2006, we conducted an in-person health survey of the adult care-givers of children aged 7 and under, residing in five districts of Shanghai. We compared uninsurance rates between temporary and permanent child residents, and investigated factors associated with child health uninsurance. RESULTS: Even though cooperative insurance eligibility has been extended to temporary residents of Shanghai, the uninsurance rate was significantly higher among temporary (65.6%) than permanent child residents (21.1%, adjusted odds ratio (OR): 5.85, 95% confidence interval (95% CI): 4.62–7.41). For both groups, family income was associated with having child health insurance; children in lower income families were more likely to be uninsured (OR: 1.96, 95% CI: 1.40–2.96). CONCLUSION: Children must rely on their parents to make the insurance purchase decision, which is constrained by their income and the perceived benefits of the insurance program. Children from migrant families are at even higher risk for uninsurance due to their lower socioeconomic status. Government initiatives specifically targeting temporary residents and providing health insurance benefits for their children are urgently needed.

    Do coder characteristics influence validity of ICD-10 hospital discharge data?

    BACKGROUND: Administrative data are widely used to study health systems and make important health policy decisions. Yet little is known about the influence of coder characteristics on administrative data validity in these studies. Our goal was to describe the relationship between several measures of validity in coded hospital discharge data and 1) coders' volume of coding (≥13,000 vs. <13,000 records), 2) coders' employment status (full- vs. part-time), and 3) hospital type. METHODS: This descriptive study examined 6 indicators of face validity in ICD-10 coded discharge records from 4 hospitals in Calgary, Canada between April 2002 and March 2007. Specifically, mean number of coded diagnoses, procedures, complications, Z-codes, and codes ending in 8 or 9 were compared by coding volume and employment status, as well as hospital type. The mean number of diagnoses was also compared across coder characteristics for 6 major conditions of varying complexity. Next, kappa statistics were computed to assess agreement between discharge data and linked chart data reabstracted by nursing chart reviewers. Kappas were compared across coder characteristics. RESULTS: 422,618 discharge records were coded by 59 coders during the study period. The mean number of diagnoses per record decreased from 5.2 in 2002/2003 to 3.9 in 2006/2007, while the number of records coded annually increased from 69,613 to 102,842. Coders at the tertiary hospital coded the most diagnoses (5.0 compared with 3.9 and 3.8 at other sites). There was no variation by coder or site characteristics for any other face validity indicator. The mean number of diagnoses increased from 1.5 to 7.9 with increasing complexity of the major diagnosis, but did not vary with coder characteristics. Agreement (kappa) between coded data and chart review did not show any consistent pattern with respect to coder characteristics. CONCLUSIONS: This large study suggests that coder characteristics do not influence the validity of hospital discharge data. Other jurisdictions might benefit from implementing similar employment programs to ours, e.g., a requirement for a 2-year college training program, a single management structure across sites, and rotation of coders between sites. Limitations include few coder characteristics available for study due to privacy concerns.
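
    The kappa statistic used in this study to compare coded data with chart review can be sketched in Python as below. This is the standard Cohen's kappa for two raters, written as a minimal illustration; the function name and the toy indicator vectors are hypothetical and are not drawn from the study.

```python
def cohens_kappa(coded, chart):
    """Cohen's kappa for agreement between two equal-length label lists.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement)
    """
    if len(coded) != len(chart) or not coded:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(coded)
    observed = sum(a == b for a, b in zip(coded, chart)) / n

    # Chance agreement from each source's marginal label frequencies.
    labels = set(coded) | set(chart)
    expected = sum(
        (coded.count(label) / n) * (chart.count(label) / n) for label in labels
    )
    if expected == 1.0:
        return 1.0  # both sources use a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical presence (1) / absence (0) of one condition in ten records.
coded = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]   # administrative discharge data
chart = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]   # chart review
kappa = cohens_kappa(coded, chart)
print(f"kappa = {kappa:.3f}")
```

    Computing this separately for records handled by, say, full-time and part-time coders would give the kind of stratified comparison the abstract describes.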

    Improved accuracy of co-morbidity coding over time after the introduction of ICD-10 administrative data

    BACKGROUND: Co-morbidity information derived from administrative data needs to be validated to allow its regular use. We assessed evolution in the accuracy of coding for Charlson and Elixhauser co-morbidities at three time points over a 5-year period, following the introduction of the International Classification of Diseases, 10th Revision (ICD-10), coding of hospital discharges. METHODS: Cross-sectional time trend evaluation study of coding accuracy using hospital chart data of 3,499 randomly selected patients who were discharged in 1999, 2001 and 2003, from two teaching and one non-teaching hospital in Switzerland. We measured sensitivity, positive predictive and Kappa values for agreement between administrative data coded with ICD-10 and chart data as the 'reference standard' for recording 36 co-morbidities. RESULTS: For the 17 Charlson co-morbidities, the median sensitivity (min-max) was 36.5% (17.4-64.1) in 1999, 42.5% (22.2-64.6) in 2001 and 42.8% (8.4-75.6) in 2003. For the 29 Elixhauser co-morbidities, the sensitivity was 34.2% (1.9-64.1) in 1999, 38.6% (10.5-66.5) in 2001 and 41.6% (5.1-76.5) in 2003. Between 1999 and 2003, sensitivity estimates increased for 30 co-morbidities and decreased for 6 co-morbidities. The increase in sensitivities was statistically significant for six conditions and the decrease significant for one. Kappa values increased for 29 co-morbidities and decreased for seven. CONCLUSIONS: Accuracy of administrative data in recording clinical conditions improved slightly between 1999 and 2003. These findings are of relevance to all jurisdictions introducing new coding systems, because they demonstrate a phenomenon of improved administrative data accuracy that may relate to a coding 'learning curve' with the new coding system.
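
    As a rough illustration of the sensitivity and positive predictive value measures named in this abstract, the Python sketch below treats chart review as the reference standard and administrative coding as the test. The function name `accuracy_measures` and the toy data are hypothetical; this is a sketch of the standard definitions, not the study's analysis code.

```python
def accuracy_measures(admin, chart):
    """Sensitivity and PPV of administrative coding against chart review.

    admin, chart: equal-length sequences of 1/0 flags for one co-morbidity.
    """
    if len(admin) != len(chart):
        raise ValueError("inputs must have equal length")
    tp = sum(1 for a, c in zip(admin, chart) if a and c)       # coded and in chart
    fp = sum(1 for a, c in zip(admin, chart) if a and not c)   # coded, not in chart
    fn = sum(1 for a, c in zip(admin, chart) if not a and c)   # in chart, not coded

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return {"sensitivity": sensitivity, "ppv": ppv}


# Hypothetical flags for one co-morbidity across eight patients.
admin = [1, 0, 0, 1, 0, 1, 0, 0]
chart = [1, 1, 1, 1, 0, 0, 0, 0]
measures = accuracy_measures(admin, chart)
print(measures)
```

    Low sensitivity with a higher PPV, as in this toy example, is the pattern of under-coding: conditions present in the chart are often missed, while conditions that are coded are usually genuine.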

    Ethnic and sex differences in the incidence of hospitalized acute myocardial infarction: British Columbia, Canada 1995-2002

    BACKGROUND: As populations in Western countries continue to change in their ethnic composition, there is a need for regular surveillance of diseases that have previously shown some health disparities. Earlier data have already demonstrated high rates of cardiovascular mortality among South Asians and relatively lower rates among people of Chinese descent. The aim of this study was to describe the differences in the incidence of hospitalized acute myocardial infarction (AMI) among the three largest ethnic groups in British Columbia (BC), Canada. METHODS: Using hospital administrative data, we identified all patients with incident AMI in BC between April 1, 1995, and March 31, 2002. Census data from 2001 provided the denominator for the entire BC population. Ethnicity was determined using validated surname analysis and applied to the census and hospital administrative datasets. Direct age standardization was used to compare incidence rates. RESULTS: A total of 34,848 AMI cases were identified. Among men, South Asians had the highest age-standardized rate of AMI hospitalization at 4.97/1000 population/year, followed by Whites at 3.29, and then Chinese at 0.98. Young South Asian men, in particular, showed incidence rates that were double those of young White men and ten times those of young Chinese men. South Asian women also had the highest age-standardized rate of AMI hospitalization at 2.35/1000 population/year, followed by White women (1.53) and Chinese women (0.49). CONCLUSIONS: South Asians continue to have a higher incidence of hospitalized AMI while incidence rates among Chinese remain low. Ethnic differences are most notable among younger men.
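
    Direct age standardization, the method named in this abstract, can be sketched in Python as a weighted average of age-specific rates, with weights taken from a standard population. The age bands, counts, and standard population below are hypothetical and assume one year of observation; they are not the study's data.

```python
def directly_standardized_rate(cases, population, standard_population, per=1000):
    """Directly age-standardized rate per `per` population.

    cases, population:   counts per age band for the group being studied
    standard_population: counts per age band in the chosen standard population
    """
    if not (len(cases) == len(population) == len(standard_population)):
        raise ValueError("all inputs must cover the same age bands")
    total_standard = sum(standard_population)

    rate = 0.0
    for c, p, s in zip(cases, population, standard_population):
        age_specific_rate = c / p          # crude rate within this age band
        weight = s / total_standard        # share of the standard population
        rate += age_specific_rate * weight
    return rate * per


# Hypothetical one-year counts for three age bands (young, middle, old).
cases = [5, 20, 60]
population = [10_000, 8_000, 4_000]
standard = [50_000, 30_000, 20_000]
asr = directly_standardized_rate(cases, population, standard)
print(f"age-standardized rate = {asr:.2f} per 1000 per year")
```

    Applying the same standard population to each ethnic group removes differences in age structure, so that the resulting rates can be compared directly.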