236 research outputs found

    Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

    From medical charts to national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare now generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitized patient records, it is important to remember that volume represents only one aspect of the data. In fact, drawing on an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter highlights such challenges and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques. Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20 Pages, 1 Figure

    Body composition and body fat distribution are related to cardiac autonomic control in non-alcoholic fatty liver disease patients

    BACKGROUND/OBJECTIVES: Heart rate recovery (HRR), a cardiac autonomic control marker, was shown to be related to body composition (BC), yet this was not tested in non-alcoholic fatty liver disease (NAFLD) patients. The aim of this study was to determine if, and to what extent, markers of BC and body fat (BF) distribution are related to cardiac autonomic control in NAFLD patients. SUBJECTS/METHODS: BC was assessed with dual-energy X-ray absorptiometry in 28 NAFLD patients (19 men, 51±13 years, and 9 women, 47±13 years). BF depot ratios were calculated to assess BF distribution. Subjects’ HRR was recorded 1 (HRR1) and 2 min (HRR2) immediately after a maximum graded exercise test. RESULTS: BC and BF distribution were related to HRR; particularly weight, trunk BF and trunk BF-to-appendicular BF ratio showed a negative relation with HRR1 (r = 0.613, r = 0.597 and r = 0.547, respectively, P < 0.01) and HRR2 (r = 0.484, r = 0.446, P < 0.05, and r = 0.590, P < 0.01, respectively). Age seems to be related to both HRR1 and HRR2 except when controlled for BF distribution. The preferred model in multiple regression should include trunk BF-to-appendicular BF ratio and BF to predict HRR1 (r² = 0.549; P < 0.05), and trunk BF-to-appendicular BF ratio alone to predict HRR2 (r² = 0.430; P < 0.001). CONCLUSIONS: BC and BF distribution were related to HRR in NAFLD patients. Trunk BF-to-appendicular BF ratio was the best independent predictor of HRR and therefore may be best related to increased cardiovascular risk, and may act as a mediator in age-related cardiac autonomic control variation.

    Ambient-aware continuous care through semantic context dissemination

    Background: The ultimate ambient-intelligent care room contains numerous sensors and devices to monitor the patient, sense and adjust the environment and support the staff. This sensor-based approach results in a large amount of data, which can be processed by current and future applications, e.g., task management and alerting systems. Today, nurses are responsible for coordinating all these applications and the information they supply, which reduces the added value and slows down the adoption rate. The aim of the presented research is the design of a pervasive and scalable framework that is able to optimize continuous care processes by intelligently reasoning on the large amount of heterogeneous care data. Methods: The developed Ontology-based Care Platform (OCarePlatform) consists of modular components that each perform a specific reasoning task. Consequently, they can easily be replicated and distributed. Complex reasoning is achieved by combining the results of different components. To ensure that the components only receive information that is of interest to them at that time, they are able to dynamically generate and register filter rules with a Semantic Communication Bus (SCB). This SCB semantically filters all the heterogeneous care data according to the registered rules by using a continuous care ontology. The SCB can be distributed and a cache can be employed to ensure scalability. Results: A prototype implementation is presented consisting of a new-generation nurse call system supported by a localization and a home automation component. The amount of data that is filtered and the performance of the SCB are evaluated by testing the prototype in a living lab. The delay introduced by processing the filter rules is negligible when 10 or fewer rules are registered. Conclusions: The OCarePlatform allows disseminating relevant care data for the different applications and additionally supports composing complex applications from a set of smaller independent components. This way, the platform significantly reduces the amount of information that needs to be processed by the nurses. The delay resulting from processing the filter rules is linear in the number of rules. Distributed deployment of the SCB and use of a cache allow further improvement of these performance results.
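The rule-registration idea in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the class names `FilterRule` and `CommunicationBus` are hypothetical, and plain key/value matching stands in for the ontology-based semantic filtering the abstract describes. The loop over registered rules shows why the filtering delay would grow linearly with the number of rules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]


@dataclass
class FilterRule:
    """Hypothetical rule: an event matches if every required field has the given value."""
    required: Dict[str, str]

    def matches(self, event: Event) -> bool:
        return all(event.get(key) == value for key, value in self.required.items())


@dataclass
class CommunicationBus:
    """Toy stand-in for a filtering bus; NOT the SCB's real API or semantics."""
    subscriptions: List[Tuple[FilterRule, Callable[[Event], None]]] = field(default_factory=list)

    def register(self, rule: FilterRule, callback: Callable[[Event], None]) -> None:
        # A component registers the rule describing what it currently cares about.
        self.subscriptions.append((rule, callback))

    def publish(self, event: Event) -> int:
        # Each rule is checked once per event, so the cost is linear in the rule count.
        delivered = 0
        for rule, callback in self.subscriptions:
            if rule.matches(event):
                callback(event)
                delivered += 1
        return delivered


received: List[Event] = []
bus = CommunicationBus()
bus.register(FilterRule({"type": "call", "room": "101"}), received.append)
bus.publish({"type": "call", "room": "101"})         # forwarded to the subscriber
bus.publish({"type": "temperature", "room": "101"})  # filtered out
```

In this sketch a component only ever sees the events its own rule lets through, which is the property the abstract credits with reducing the information load.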

    Pica associated with iron deficiency or depletion: clinical and laboratory correlates in 262 non-pregnant adult outpatients

    Background: There are many descriptions of the association of pica with iron deficiency in adults, but there are few reports in which observations available at diagnosis of iron deficiency were analyzed using multivariable techniques to identify significant predictors of pica. We sought to identify clinical and laboratory correlates of pica in adults with iron deficiency or depletion using univariable and stepwise forward logistic regression analyses. Methods: We reviewed charts of 262 non-pregnant adult outpatients (ages ≥18 y) who required treatment with intravenous iron dextran. We tabulated their sex, age, race/ethnicity, body mass index, symptoms and causes of iron deficiency or depletion, serum iron and complete blood count measures, and other conditions at diagnosis before intravenous iron dextran was administered. We excluded patients with serum creatinine >133 μmol/L or disorders that could affect erythrocyte or iron measures. Iron deficiency was defined as both serum ferritin (SF) <45 pmol/L and transferrin saturation (TS) <10%. Iron depletion was defined as SF <112 pmol/L. We performed univariable comparisons and stepwise forward logistic regression analyses to identify significant correlates of pica. Results: There were 230 women (184 white, 46 black; ages 19-91 y) and 32 men (31 white, 1 black; ages 24-81 y). 118 patients (45.0%) reported pica; of these, 87.3% reported ice pica (pagophagia). In univariable analyses, patients with pica had lower mean age, were more often of black race/ethnicity, and had higher prevalences of cardiopulmonary and epithelial manifestations. The prevalence of iron deficiency, with or without anemia, did not differ significantly between patients with and without pica reports. Mean hemoglobin and mean corpuscular volume (MCV) were lower and mean red blood cell distribution width (RDW) and platelet count were higher in patients with pica. Thrombocytosis occurred only in women and was more prevalent in those with pica (20.4% vs. 8.3%; p = 0.0050). Mean total iron-binding capacity was higher and mean serum ferritin was lower in patients with pica. Nineteen patients developed a second episode of iron deficiency or depletion; concordance of recurrent pica (or absence of pica) was 95%. Predictors of pica in logistic regression analyses were age and MCV (negative associations; p = 0.0250 and 0.0018, respectively) and RDW and platelet count (positive associations; p = 0.0009 and 0.02215, respectively); the odds ratios of these predictors were low. Conclusions: In non-pregnant adult patients with iron deficiency or depletion, lower age is a significant predictor of pica. Patients with pica have lower MCV, higher RDW, and higher platelet counts than patients without pica.

    Bioenergetic cues shift FXR splicing towards FXR alpha 2 to modulate hepatic lipolysis and fatty acid metabolism

    Objective: Farnesoid X receptor (FXR) plays a prominent role in hepatic lipid metabolism. The FXR gene encodes four proteins with structural differences suggestive of discrete biological functions about which little is known. Methods: We expressed each FXR variant in primary hepatocytes and evaluated global gene expression, lipid profile, and metabolic fluxes. Gene delivery of FXR variants to Fxr(-/-) mouse liver was performed to evaluate their role in vivo. The effects of fasting and physical exercise on hepatic Fxr splicing were determined. Results: We show that FXR splice isoforms regulate largely different gene sets and have specific effects on hepatic metabolism. FXR alpha 2 (but not alpha 1) activates a broad transcriptional program in hepatocytes conducive to lipolysis, fatty acid oxidation, and ketogenesis. Consequently, FXR alpha 2 decreases cellular lipid accumulation and improves cellular insulin signaling to AKT. FXR alpha 2 expression in Fxr(-/-) mouse liver activates a similar gene program and robustly decreases hepatic triglyceride levels. On the other hand, FXR alpha 1 reduces hepatic triglyceride content to a lesser extent and does so through regulation of lipogenic gene expression. Bioenergetic cues, such as fasting and exercise, dynamically regulate Fxr splicing in mouse liver to increase Fxr alpha 2 expression. Conclusions: Our results show that the main FXR variants in human liver (alpha 1 and alpha 2) reduce hepatic lipid accumulation through distinct mechanisms and to different degrees. Taking this novel mechanism into account could greatly improve the pharmacological targeting and therapeutic efficacy of FXR agonists. © 2015 The Authors. Published by Elsevier GmbH. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Funding: Novo Nordisk Fonden [NNF12OC1016062]; European Research Council [233285].

    Logical Development of the Cell Ontology

    Background: The Cell Ontology (CL) is an ontology for the representation of in vivo cell types. As biological ontologies such as the CL grow in complexity, they become increasingly difficult to use and maintain. By making the information in the ontology computable, we can use automated reasoners to detect errors and assist with classification. Here we report on the generation of computable definitions for the hematopoietic cell types in the CL. Results: Computable definitions for over 340 CL classes have been created using a genus-differentia approach. These define cell types according to multiple axes of classification such as the protein complexes found on the surface of a cell type, the biological processes participated in by a cell type, or the phenotypic characteristics associated with a cell type. We employed automated reasoners to verify the ontology and to reveal mistakes in manual curation. The implementation of this process exposed areas in the ontology where new cell type classes were needed to accommodate species-specific expression of cellular markers. Our use of reasoners also inferred new relationships within the CL, and between the CL and the contributing ontologies. This restructured ontology can be used to identify immune cells by flow cytometry, supports sophisticated biological queries involving cells, and helps generate new hypotheses about cell function based on similarities to other cell types. Conclusion: Use of computable definitions enhances the development of the CL and supports the interoperability of OBO ontologies.

    Chronic disease prevalence from Italian administrative databases in the VALORE project: a validation through comparison of population estimates with general practice databases and national survey

    BACKGROUND: Administrative databases are widely available and have been extensively used to provide estimates of chronic disease prevalence for the purpose of surveillance of both geographical and temporal trends. There are, however, other sources of data available, such as medical records from primary care and national surveys. In this paper we compare disease prevalence estimates obtained from these three different data sources. METHODS: Data from general practitioners (GP) and administrative transactions for health services were collected from five Italian regions (Veneto, Emilia Romagna, Tuscany, Marche and Sicily) spanning all three macroareas of the country (North, Center, South). Crude prevalence estimates were calculated by data source and region for diabetes, ischaemic heart disease, heart failure and chronic obstructive pulmonary disease (COPD). For diabetes and COPD, prevalence estimates were also obtained from a national health survey. When necessary, estimates were adjusted for completeness of data ascertainment. RESULTS: Crude prevalence estimates of diabetes in administrative databases (range: from 4.8% to 7.1%) were lower than corresponding GP (6.2%-8.5%) and survey-based estimates (5.1%-7.5%). Geographical trends were similar in the three sources and estimates based on treatment were the same, while estimates adjusted for completeness of ascertainment (6.1%-8.8%) were slightly higher. For ischaemic heart disease, administrative and GP data sources were fairly consistent, with prevalence ranging from 3.7% to 4.7% and from 3.3% to 4.9%, respectively. In the case of heart failure, administrative estimates were consistently higher than GPs' estimates in all five regions, the highest difference being 1.4% vs 1.1%. For COPD, the estimates from administrative data, ranging from 3.1% to 5.2%, fell into the confidence interval of the survey estimates in four regions, but failed to detect the higher prevalence in the most Southern region (4.0% in administrative data vs 6.8% in survey data). The prevalence estimates for COPD from GP data were consistently higher than the corresponding estimates from the other two sources. CONCLUSION: This study supports the use of data from Italian administrative databases to estimate geographic differences in population prevalence of ischaemic heart disease, treated diabetes, diabetes mellitus and heart failure. The algorithm for COPD used in this study requires further refinement.

    The Canine Papillomavirus and Gamma HPV E7 Proteins Use an Alternative Domain to Bind and Destabilize the Retinoblastoma Protein

    The high-risk HPV E6 and E7 proteins cooperate to immortalize primary human cervical cells and the E7 protein can independently transform fibroblasts in vitro, primarily due to its ability to associate with and degrade the retinoblastoma tumor suppressor protein, pRb. The binding of E7 to pRb is mediated by a conserved Leu-X-Cys-X-Glu (LXCXE) motif in the conserved region 2 (CR2) of E7 and this domain is both necessary and sufficient for E7/pRb association. In the current study, we report that the malignancy-associated canine papillomavirus type 2 encodes an E7 protein that has serine substituted for cysteine in the LXCXE motif. In HPV, this substitution in E7 abrogates pRb binding and degradation. However, despite variation at this critical site, the canine papillomavirus E7 protein still bound and degraded pRb. Even complete deletion of the LXSXE domain of canine E7 failed to interfere with binding to pRb in vitro and in vivo. Rather, the dominant binding site for pRb mapped to the C-terminal domain of canine E7. Finally, while the CR1 and CR2 domains of HPV E7 are sufficient for degradation of pRb, the C-terminal region of canine E7 was also required for pRb degradation. Screening of HPV genome sequences revealed that the LXSXE motif of the canine E7 protein was also present in the gamma HPVs and we demonstrate that the gamma HPV-4 E7 protein also binds pRb in a similar way. It appears, therefore, that the type 2 canine PV and gamma-type HPVs share not only similar properties with respect to tissue specificity and association with immunosuppression, but also the mechanism by which their E7 proteins interact with pRb.

    Chapter 12: Systematic Review of Prognostic Tests

    A number of new biological markers are being studied as predictors of disease or adverse medical events among those who already have a disease. Systematic reviews of this growing literature can help determine whether the available evidence supports use of a new biomarker as a prognostic test that can more accurately place patients into different prognostic groups to improve treatment decisions and the accuracy of outcome predictions. Exemplary reviews of prognostic tests are not widely available, and the methods used to review diagnostic tests do not necessarily address the most important questions about prognostic tests that are used to predict the time-dependent likelihood of future patient outcomes. We provide suggestions for those interested in conducting systematic reviews of a prognostic test. The proposed use of the prognostic test should serve as the framework for a systematic review and help define the key questions. The outcome probabilities or level of risk and other characteristics of prognostic groups are the most salient statistics for review and perhaps meta-analysis. Reclassification tables can help determine how a prognostic test affects the classification of patients into different prognostic groups, and hence their treatment. Review of studies of the association between a potential prognostic test and patient outcomes would have little impact other than to determine whether further development as a prognostic test might be warranted.
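As a rough illustration of the reclassification tables mentioned in the abstract above, the sketch below cross-tabulates each patient's prognostic group before and after a new test is added. The function names and the example group labels are hypothetical, and the abstract does not prescribe any particular computation; this only shows the basic cross-tabulation.

```python
from collections import Counter
from typing import Iterable, Tuple


def reclassification_table(old_groups: Iterable[str], new_groups: Iterable[str]) -> Counter:
    """Count patients by (group without the test, group with the test)."""
    old_list, new_list = list(old_groups), list(new_groups)
    if len(old_list) != len(new_list):
        raise ValueError("each patient needs one old and one new group")
    return Counter(zip(old_list, new_list))


def fraction_reclassified(table: Counter) -> float:
    """Share of patients whose prognostic group changed."""
    total = sum(table.values())
    moved = sum(count for (old, new), count in table.items() if old != new)
    return moved / total if total else 0.0


# Illustrative labels only; not data from any study.
old = ["low", "low", "high", "high"]
new = ["low", "high", "high", "high"]
table = reclassification_table(old, new)
share_moved = fraction_reclassified(table)
```

The off-diagonal cells (old group differs from new group) are the patients whose treatment decision could change, which is why the abstract highlights this table over simple association measures.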

    Accommodating Ontologies to Biological Reality—Top-Level Categories of Cumulative-Constitutively Organized Material Entities

    BACKGROUND: The Basic Formal Ontology (BFO) is a top-level formal foundational ontology for the biomedical domain. It has been developed to serve as an ontologically consistent template for top-level categories of application-oriented and domain reference ontologies within the Open Biological and Biomedical Ontologies Foundry (OBO). BFO is important for enabling OBO ontologies to reliably communicate and manage data and metadata within and across biomedical databases. Following its intended single inheritance policy, BFO's three top-level categories of material entity (i.e. ‘object’, ‘fiat object part’, ‘object aggregate’) must be exhaustive and mutually disjoint. We have shown elsewhere that for accommodating all types of constitutively organized material entities, BFO must be extended by additional categories of material entity. METHODOLOGY/PRINCIPAL FINDINGS: Unfortunately, most biomedical material entities are cumulative-constitutively organized. We show that even the extended BFO does not exhaustively cover cumulative-constitutively organized material entities. We provide examples from biology and everyday life that demonstrate the necessity for ‘portion of matter’ as another material building block. This implies the necessity for further extending BFO by ‘portion of matter’ as well as three additional categories that possess portions of matter as aggregate components. These extensions are necessary if the basic assumption that all parts that share the same granularity level exhaustively sum to the whole should also apply to cumulative-constitutively organized material entities. By suggesting a notion of granular representation we provide a way to maintain the single inheritance principle when dealing with cumulative-constitutively organized material entities. CONCLUSIONS/SIGNIFICANCE: We suggest extending BFO to incorporate additional categories of material entity and rearranging its top-level material entity taxonomy. With these additions and the notion of granular representation, BFO would exhaustively cover all top-level types of material entities that application-oriented ontologies may use as templates, while still maintaining the single inheritance principle.