23 research outputs found

    Indirect determination of serum creatinine reference intervals in a Pakistani pediatric population using big data analytics

    Background: Indirect methods of establishing reference intervals (RIs) based on data mining are used to overcome the ethical and practical challenges and the cost associated with the conventional direct approach. Aim: To generate RIs for serum creatinine in children and adolescents using an indirect statistical tool. Methods: Data mining of the laboratory information system was performed for serum creatinine results from birth to 17 years for both genders. The study period covered six years, from January 2013 to December 2018. Microsoft Excel 2010 and an indirect algorithm developed by the German Society of Clinical Chemistry and Laboratory Medicine's Working Group on Guide Limits were used for the data analysis. Results: Data were extracted from 96,104 samples and, after excluding multiple samples from the same individual, RIs were calculated for 21,920 males and 14,846 females, with stratification into six discrete age groups. Conclusion: Serum creatinine dynamics varied significantly across gender and age groups.

    Reference Interval Estimation from Mixed Distributions using Truncation Points and the Kolmogorov-Smirnov Distance (kosmic)

    Appropriate reference intervals are essential when using laboratory test results to guide medical decisions. Conventional approaches for the establishment of reference intervals rely on large samples from healthy and homogeneous reference populations. However, this approach is associated with substantial financial and logistic challenges, subject to ethical restrictions in children, and limited in older individuals due to the high prevalence of chronic morbidities and medication. We implemented an indirect method for reference interval estimation, which uses mixed physiological and abnormal test results from clinical information systems, to overcome these restrictions. The algorithm minimizes the difference between an estimated parametric distribution and a truncated part of the observed distribution, specifically the Kolmogorov-Smirnov distance between a hypothetical Gaussian distribution and the observed distribution of test results after Box-Cox transformation. Simulations of common laboratory tests with increasing proportions of abnormal test results show reliable reference interval estimations even in challenging simulation scenarios, when <20% of test results are abnormal. Additionally, reference intervals generated using samples from a university hospital's laboratory information system, with a gradually increasing proportion of abnormal test results, remained stable, even if samples from units with a substantial prevalence of pathologies were included. A high-performance open-source C++ implementation is available at https://gitlab.miracum.org/kosmic
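
The objective described in this abstract can be illustrated with a minimal Python sketch. It is not the kosmic implementation: the function names (`box_cox`, `truncated_ks_distance`, `reference_interval`), the way the truncated Gaussian is renormalized, and the choice of the central 95% (mean ± 1.96 standard deviations on the transformed scale) are assumptions made for illustration. The sketch only evaluates the distance for one candidate parameter set; a full method would also search over the transformation parameter, the Gaussian parameters, and the truncation points.

```python
import math
import random


def box_cox(x, lam):
    """Box-Cox transform of a positive value x with power parameter lam."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def truncated_ks_distance(values, lam, mu, sigma, lo, hi):
    """Kolmogorov-Smirnov distance between the empirical distribution of the
    transformed results inside [lo, hi] and a Gaussian(mu, sigma) truncated
    to the same interval (illustrative objective, hypothetical signature)."""
    inside = sorted(box_cox(v, lam) for v in values if lo <= v <= hi)
    n = len(inside)
    if n == 0:
        raise ValueError("no results inside the truncation interval")
    f_lo = normal_cdf(box_cox(lo, lam), mu, sigma)
    f_hi = normal_cdf(box_cox(hi, lam), mu, sigma)
    mass = f_hi - f_lo
    if mass <= 0.0:
        raise ValueError("candidate Gaussian has no mass inside the interval")
    distance = 0.0
    for i, t in enumerate(inside):
        model = (normal_cdf(t, mu, sigma) - f_lo) / mass
        # Compare against the empirical step function just before and at t.
        distance = max(distance, abs(model - i / n), abs(model - (i + 1) / n))
    return distance


def reference_interval(lam, mu, sigma):
    """Central 95% interval of the fitted Gaussian, mapped back to the
    original measurement scale with the inverse Box-Cox transform."""
    def inverse(y):
        return math.exp(y) if lam == 0 else (lam * y + 1.0) ** (1.0 / lam)
    return inverse(mu - 1.96 * sigma), inverse(mu + 1.96 * sigma)


if __name__ == "__main__":
    random.seed(0)
    # Synthetic, log-normally distributed "results" (not real laboratory data).
    results = [random.lognormvariate(1.0, 0.2) for _ in range(5000)]
    print(truncated_ks_distance(results, 0.0, 1.0, 0.2, 2.0, 3.6))
    print(reference_interval(0.0, 1.0, 0.2))
```

A candidate whose Gaussian matches the central part of the data yields a small distance, while a shifted candidate yields a larger one, which is what makes the distance usable as a minimization target.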

    Latent class distributional regression for the estimation of non-linear reference limits from contaminated data sources

    Background: Medical decision making based on quantitative test results depends on reliable reference intervals, which represent the range of physiological test results in a healthy population. Current methods for the estimation of reference limits focus either on modelling the age-dependent dynamics of different analytes directly in a prospective setting or the extraction of independent distributions from contaminated data sources, e.g. data with latent heterogeneity due to unlabeled pathologic cases. In this article, we propose a new method to estimate indirect reference limits with non-linear dependencies on covariates from contaminated datasets by combining the framework of mixture models and distributional regression. Results: Simulation results based on mixtures of Gaussian and gamma distributions suggest accurate approximation of the true quantiles that improves with increasing sample size and decreasing overlap between the mixture components. Due to the high flexibility of the framework, initialization of the algorithm requires careful considerations regarding appropriate starting weights. Estimated quantiles from the extracted distribution of healthy hemoglobin concentration in boys and girls provide clinically useful pediatric reference limits similar to solutions obtained using different approaches which require more samples and are computationally more expensive. Conclusions: Latent class distributional regression models represent the first method to estimate indirect non-linear reference limits from a single model fit, but the general scope of applications can be extended to other scenarios with latent heterogeneity.
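
The mixture-model idea behind this abstract can be sketched in a much simpler setting. The following Python example is not the latent class distributional regression method: it ignores covariates such as age entirely and fits a plain two-component Gaussian mixture with the EM algorithm, then reads reference limits off the first ("healthy") component. The function names, starting values, and the synthetic data are assumptions for illustration only. It also shows why the abstract stresses starting weights: the result depends on where the components are initialized.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, mu_init, sigma_init, weight_init=(0.5, 0.5),
                      iterations=100):
    """Fit a two-component Gaussian mixture by EM (hypothetical helper).
    Component 0 is treated as the non-pathological component."""
    mus, sigmas, weights = list(mu_init), list(sigma_init), list(weight_init)
    n = len(values)
    for _ in range(iterations):
        # E-step: posterior probability of each component for every value.
        resp = []
        for x in values:
            p0 = weights[0] * normal_pdf(x, mus[0], sigmas[0])
            p1 = weights[1] * normal_pdf(x, mus[1], sigmas[1])
            total = p0 + p1
            resp.append((p0 / total, p1 / total) if total > 0.0 else (0.5, 0.5))
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mus[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, values)) / nk
            sigmas[k] = max(math.sqrt(var), 1e-6)
            weights[k] = nk / n
    return weights, mus, sigmas


def central_95(mu, sigma):
    """2.5th and 97.5th percentiles of a Gaussian component."""
    return mu - 1.96 * sigma, mu + 1.96 * sigma


if __name__ == "__main__":
    random.seed(1)
    # Synthetic values loosely resembling hemoglobin in g/dL (not real data).
    healthy = [random.gauss(13.0, 1.0) for _ in range(2000)]
    pathological = [random.gauss(9.0, 1.5) for _ in range(500)]
    weights, mus, sigmas = fit_two_gaussians(
        healthy + pathological, mu_init=(12.0, 8.0), sigma_init=(2.0, 2.0))
    print(weights, mus, sigmas)
    print(central_95(mus[0], sigmas[0]))
```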

    Mixture density networks for the indirect estimation of reference intervals

    Background: Reference intervals represent the expected range of physiological test results in a healthy population and are essential to support medical decision making. Particularly in the context of pediatric reference intervals, where recruitment regulations make prospective studies challenging to conduct, indirect estimation strategies are becoming increasingly important. Established indirect methods enable robust identification of the distribution of “healthy” samples from laboratory databases, which include unlabeled pathologic cases, but are currently severely limited when adjusting for essential patient characteristics such as age. Here, we propose the use of mixture density networks (MDN) to overcome this problem and model all parameters of the mixture distribution in a single step. Results: Estimated reference intervals from varying settings with simulated data demonstrate the ability to accurately estimate latent distributions from unlabeled data using different implementations of MDNs. Comparing the performance with alternative estimation approaches further highlights the importance of modeling the mixture component weights as a function of the input in order to avoid biased estimates for all other parameters and the resulting reference intervals. We also provide a strategy to generate partially customized starting weights to improve proper identification of the latent components. Finally, the application to real-world hemoglobin samples provides results in line with current gold standard approaches, but also suggests further investigations with respect to adequate regularization strategies in order to prevent overfitting the data. Conclusions: Mixture density networks provide a promising approach capable of extracting the distribution of healthy samples from unlabeled laboratory databases while simultaneously and explicitly estimating all parameters and component weights as non-linear functions of the covariate(s), thereby allowing the estimation of age-dependent reference intervals in a single step. Further studies on model regularization and asymmetric component distributions are warranted to consolidate our findings and expand the scope of applications.

    refineR: A Novel Algorithm for Reference Interval Estimation from Real-World Data

    Reference intervals are essential for the interpretation of laboratory test results in medicine. We propose a novel indirect approach to estimate reference intervals from real-world data as an alternative to direct methods, which require samples from healthy individuals. The presented refineR algorithm separates the non-pathological distribution from the pathological distribution of observed test results using an inverse approach and identifies the model that best explains the non-pathological distribution. To evaluate its performance, we simulated test results from six common laboratory analytes with a varying location and fraction of pathological test results. Estimated reference intervals were compared to the ground truth, an alternative indirect method (kosmic), and the direct method (N = 120 and N = 400 samples). Overall, refineR achieved the lowest mean percentage error of all methods (2.77%). Analyzing the proportion of reference intervals within ± 1 total error deviation from the ground truth, refineR (82.5%) was inferior to the direct method with N = 400 samples (90.1%), but outperformed kosmic (70.8%) and the direct method with N = 120 (67.4%). Additionally, reference intervals estimated from pediatric data were comparable to published direct method studies. In conclusion, the refineR algorithm enables precise estimation of reference intervals from real-world data and represents a viable complement to the direct method.

    Temporal evolution and differential patterns of cellular reconstitution after therapy for childhood cancers

    The cellular reconstitution after childhood cancer therapy is associated with the risk of infection and efficacy of revaccination. Many studies have described the reconstitution after stem cell transplantation (SCT). The recovery after cancer treatment in children who have not undergone SCT has mainly been investigated in acute lymphoblastic leukemia (ALL), less for solid tumors. Here, we have examined the temporal evolution of total leukocyte, neutrophil and lymphocyte counts as surrogate parameters for the post-therapeutic immune recovery in a cohort of n = 52 patients with ALL in comparison to n = 58 patients with Hodgkin’s disease (HD) and n = 22 patients with Ewing sarcoma (ES). Patients with ALL showed an efficient increase in blood counts reaching the age-adjusted lower limits of normal between 4 and 5 months after the end of maintenance therapy. The two groups of patients with HD and ES exhibited a comparably delayed recovery of total leukocytes due to a protracted post-therapeutic lymphopenia which was most pronounced in patients with HD after irradiation. Overall, we observed a clearly more efficient resurgence of total lymphocyte counts in patients aged below 12 years compared to patients aged 12 to 18 years. Our results underline that the kinetics of cellular reconstitution after therapy for HD and ES differ significantly from ALL and depend on treatment regimens and modalities as well as on patient age. This suggests a need for disease, treatment, and age specific recommendations concerning the duration of infection prophylaxis and the timing of revaccination.

    KETOS: Clinical decision support and machine learning as a service – A training and deployment platform based on Docker, OMOP-CDM, and FHIR Web Services

    Background and objective: To take full advantage of decision support, machine learning, and patient-level prediction models, it is important that models are not only created, but also deployed in a clinical setting. The KETOS platform demonstrated in this work implements a tool for researchers allowing them to perform statistical analyses and deploy resulting models in a secure environment. Methods: The proposed system uses Docker virtualization to provide researchers with reproducible data analysis and development environments, accessible via Jupyter Notebook, to perform statistical analysis and develop, train and deploy models based on standardized input data. The platform is built in a modular fashion and interfaces with web services using the Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) standard to access patient data. In our prototypical implementation we use an OMOP common data model (OMOP-CDM) database. The architecture supports the entire research lifecycle from creating a data analysis environment, retrieving data, and training to final deployment in a hospital setting. Results: We evaluated the platform by establishing and deploying an analysis and end user application for hemoglobin reference intervals within the University Hospital Erlangen. To demonstrate the potential of the system to deploy arbitrary models, we loaded a colorectal cancer dataset into an OMOP database and built machine learning models to predict patient outcomes and made them available via a web service. We demonstrated both the integration with FHIR as well as an example end user application. Finally, we integrated the platform with the open source DataSHIELD architecture to allow for distributed privacy preserving data analysis and training across networks of hospitals. Conclusion: The KETOS platform takes a novel approach to data analysis, training and deploying decision support models in a hospital or healthcare setting. It does so in a secure and privacy-preserving manner, combining the flexibility of Docker virtualization with the advantages of standardized vocabularies, a widely applied database schema (OMOP-CDM), and a standardized way to exchange medical data (FHIR).

    Indirekte Bestimmung pädiatrischer Referenzintervalle für das Blutbild (Indirect determination of pediatric reference intervals for the blood count)

    Background: Determination of pediatric reference intervals (RIs) for laboratory quantities, including hematological quantities, is complex. The measured quantities vary by age, and obtaining samples from healthy children is difficult. Many widely used RIs are derived from small sample numbers and are split into arbitrary discrete age intervals. Use of intra-laboratory RIs specific to the examined population and analytical device used is not yet fully established. Indirect methods address these issues by deriving RIs from clinical laboratory databases which contain large datasets of both healthy and pathological samples. Methods: A refined indirect approach was used to create continuous age-dependent RIs for blood count quantities and sodium from birth to adulthood. The dataset for each quantity consisted of 60,000 individual samples from our clinical laboratory. Patient samples were separated according to age, and a density function of the proportion of healthy samples was estimated for each age group. The resulting RIs were merged to obtain continuous RIs from birth to adulthood. Results: The obtained RIs were compared to RIs generated by identical laboratory instruments, and to population-specific RIs created using conventional methods. This comparison showed a high concordance of reference limits and their age-dependent dynamics. Conclusions: The indirect approach reported here is well suited to create continuous, intra-laboratory RIs from clinical laboratory databases and showed that the RIs generated are comparable to those created using established methods. The procedure can be transferred to other laboratory quantities and can be used as an alternative method for RI determination where conventional approaches are limited.

    Establishment of reference intervals for alkaline phosphatase in Pakistani children using a data mining approach

    Objective: To establish reference intervals (RIs) for alkaline phosphatase (ALP) levels in Pakistani children using an indirect data mining approach. Methods: ALP levels analyzed on a Siemens Advia 1800 analyzer using the International Federation of Clinical Chemistry's photometric method for both inpatients and outpatients aged 1 to 17 years between January 2013 and December 2017, including patients from intensive care units and specialty units, were retrieved. RIs were calculated using a previously validated indirect algorithm developed by the German Society of Clinical Chemistry and Laboratory Medicine's Working Group on Guide Limits. Results: From a total of 108,845 results, after the exclusion of patients with multiple specimens, RIs were calculated for 24,628 males and 18,083 females with stratification into fine-grained age groups. These RIs demonstrate the complex age- and sex-related ALP dynamics occurring during physiological development. Conclusion: The population-specific RIs serve to allow an accurate understanding of the fluctuations in analyte activity with increasing age and to support clinical decision making.