
    Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality

    The final publication is available at Springer via http://dx.doi.org/10.1007/s10618-014-0378-6. Published online.
    Knowledge discovery on biomedical data can be based on on-line data-stream analyses or on retrospective, timestamped, off-line datasets. In both cases, changes over time in the processes that generate the data, or in their quality features, may hinder either the knowledge discovery process or the generalization of past knowledge. These problems can be seen as a lack of temporal stability in the data. This work establishes temporal stability as a data quality dimension and proposes new methods for its assessment based on a probabilistic framework. Concretely, methods are proposed for (1) monitoring changes, and (2) characterizing changes and trends and detecting temporal subgroups. First, a probabilistic change detection algorithm is proposed based on the Statistical Process Control of the posterior Beta distribution of the Jensen–Shannon distance, with a memoryless forgetting mechanism. This algorithm (PDF-SPC) classifies the degree of current change into three states: In-Control, Warning, and Out-of-Control. Second, a novel method is proposed to visualize and characterize the temporal changes of data based on the projection of a non-parametric information-geometric statistical manifold of time windows. This projection facilitates the exploration of temporal trends using the proposed IGT-plot and, by means of unsupervised learning methods, the discovery of conceptually related temporal subgroups. The methods are evaluated using real and simulated data based on the National Hospital Discharge Survey (NHDS) dataset.
    The work by C Saez has been supported by an Erasmus Lifelong Learning Programme 2013 Grant. This work has been supported by own IBIME funds. The authors thank Dr. Gregor Stiglic, from the University of Maribor, Slovenia, for his support on the NHDS data.
    Sáez Silvestre, C.; Pereira Rodrigues, P.; Gama, J.; Robles Viejo, M.; García Gómez, JM. (2014). Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality. Data Mining and Knowledge Discovery. 28:1-1. doi:10.1007/s10618-014-0378-6
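
    The abstract above builds its change detector on the Jensen–Shannon distance between the probability distributions of time windows. As a rough illustration only, the sketch below computes that distance for two discrete histograms and maps it to the three named states. The fixed thresholds (`warning`, `out_of_control`) and the function names are hypothetical stand-ins: the abstract says the control limits come from the posterior Beta distribution of the distance, and this sketch does not model that step or the forgetting mechanism.

```python
import math


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance (square root of the base-2 divergence), in [0, 1].

    p and q are discrete probability distributions over the same bins.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; bins with zero mass in x contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


def classify_change(distance, warning=0.2, out_of_control=0.4):
    """Map a distance to one of three states.

    The threshold values are illustrative assumptions, not the limits
    derived in the paper.
    """
    if distance >= out_of_control:
        return "Out-of-Control"
    if distance >= warning:
        return "Warning"
    return "In-Control"


# Hypothetical histograms for a reference window and a current window.
reference = [0.25, 0.25, 0.25, 0.25]
current = [0.10, 0.20, 0.30, 0.40]
state = classify_change(jensen_shannon_distance(reference, current))
```

    With base-2 logarithms the distance is bounded by 1, so thresholds can be expressed on a fixed scale regardless of the number of bins.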

    Usefulness of C-reactive protein as a marker of early post-infarct left ventricular systolic dysfunction

    Objective: To assess the usefulness of in-hospital measurement of C-reactive protein (CRP) concentration, in comparison to well-established risk factors, as a marker of post-infarct left ventricular systolic dysfunction (LVSD) at discharge. Materials and methods: Two hundred and four consecutive patients with ST-segment-elevation myocardial infarction (STEMI) were prospectively enrolled into the study. CRP plasma concentrations were measured before reperfusion, 24 h after admission and at discharge with an ultra-sensitive latex immunoassay. Results: CRP concentration increased significantly during the first 24 h of hospitalization (2.4 ± 1.9 vs. 15.7 ± 17.0 mg/L; p<0.001) and remained elevated at discharge (14.7 ± 14.7 mg/L), mainly in 57 patients with LVSD (2.4 ± 1.8 vs. 25.0 ± 23.4 mg/L; p<0.001; CRP at discharge 21.9 ± 18.6 mg/L). The prevalence of LVSD increased significantly across increasing tertiles of CRP concentration both at 24 h after admission (13.2 vs. 19.1 vs. 51.5 %; p<0.0001) and at discharge (14.7 vs. 23.5 vs. 45.6 %; p<0.0001). Multivariate analysis demonstrated CRP concentration at discharge to be an independent marker of early LVSD (odds ratio of 1.38 for a 10 mg/L increase, 95 % confidence interval 1.01–1.87; p<0.04). Conclusion: Measurement of CRP plasma concentration at discharge may be useful as a marker of early LVSD in patients after a first STEMI.
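
    The odds ratio above is reported per 10 mg/L increase in CRP. As a small worked example, and assuming the usual logistic-regression form in which the log-odds are linear in CRP (the abstract does not state the model details), the same estimate can be re-expressed for a different increment. The function name `rescale_odds_ratio` is hypothetical.

```python
import math


def rescale_odds_ratio(odds_ratio, from_step, to_step):
    """Re-express an odds ratio reported per `from_step` units as per `to_step` units.

    Assumes log-odds are linear in the predictor, so the log odds ratio
    scales proportionally with the size of the increment.
    """
    return math.exp(math.log(odds_ratio) * to_step / from_step)


# Reported value: odds ratio 1.38 per 10 mg/L increase in CRP at discharge.
per_1_mg = rescale_odds_ratio(1.38, from_step=10, to_step=1)
per_20_mg = rescale_odds_ratio(1.38, from_step=10, to_step=20)
```

    Under that assumption a 20 mg/L increase corresponds to 1.38 squared, roughly 1.90, while the per-1 mg/L figure is only slightly above 1. The confidence limits can be rescaled the same way.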

    Upregulation of Hemoglobin Expression by Oxidative Stress in Hepatocytes and Its Implication in Nonalcoholic Steatohepatitis

    Recent studies revealed that hemoglobin is expressed in some non-erythrocytes and that it suppresses oxidative stress when overexpressed. Oxidative stress plays a critical role in the pathogenesis of non-alcoholic steatohepatitis (NASH). This study was designed to investigate whether hemoglobin is expressed in hepatocytes and how it is related to oxidative stress in NASH patients. Analysis of microarray gene expression data revealed a significant increase in the expression of hemoglobin alpha (HBA1) and beta (HBB) in liver biopsies from NASH patients. Increased hemoglobin expression in NASH was validated by quantitative real-time PCR. However, the expression of hematopoietic transcription factors and erythrocyte-specific marker genes was not increased, indicating that increased hemoglobin expression in NASH was not from erythropoiesis, but could result from increased expression in hepatocytes. Immunofluorescence staining demonstrated positive HBA1 and HBB expression in the hepatocytes of NASH livers. Hemoglobin expression was also observed in the human hepatocellular carcinoma HepG2 cell line. Furthermore, treatment with hydrogen peroxide, a known oxidative stress inducer, increased HBA1 and HBB expression in HepG2 and HEK293 cells. Importantly, hemoglobin overexpression suppressed oxidative stress in HepG2 cells. We concluded that hemoglobin is expressed by hepatocytes and that oxidative stress upregulates its expression. Suppression of oxidative stress by hemoglobin could be a mechanism to protect hepatocytes from oxidative damage in NASH.

    Influence of FTO rs9939609 and Mediterranean diet on body composition and weight loss: a randomized clinical trial

    Background: The Mediterranean diet (MeD) plays a key role in the prevention of obesity. Among the genes involved in obesity, the fat mass and obesity-associated gene (FTO) is one of the best known, but its interaction with MeD has remained uncertain so far. Methods: We carried out a study on a sample of 188 Italian subjects, analyzing their FTO rs9939609 alleles and the difference in body composition between baseline and the end of a 4-week nutritional intervention. The sample was divided into two groups: the control group of 49 subjects, and the MeD group of 139 subjects. Results: We found significant relations between MeD and both variation of total body fat (ΔTBFat) (p = 0.00) and gynoid body fat (p = 0.04). ΔTBFat (kg) showed a significant relation with the diet-gene interaction (p = 0.04), whereas FTO was associated with the variation of total body water (p = 0.02). Conclusions: MeD proved to be a good nutritional treatment to reduce body fat mass, whereas data about FTO remain uncertain. Confirming or rejecting the hypothesis of FTO and its influence on body tissues during nutritional treatments is fundamental to deciding whether its effect has to be taken into consideration during both the development of dietetic plans and patient monitoring. Trial Registration: ClinicalTrials.gov Id: NCT01890070. Registered 01 July 2013, https://clinicaltrials.gov/ct2/show/NCT0189007

    Iterative sorting reveals CD133+ and CD133- melanoma cells as phenotypically distinct populations

    Background: The heterogeneity and tumourigenicity of metastatic melanoma are attributed to a cancer stem cell model, with CD133 considered to be a cancer stem cell marker in melanoma as well as other tumours, but its role has remained controversial. Methods: We iteratively sorted CD133+ and CD133- cells from 3 metastatic melanoma cell lines, and observed tumourigenicity and phenotypic characteristics over 7 generations of serial xeno-transplantation in NOD/SCID mice. Results: We demonstrate that iterative sorting is required to make highly pure populations of CD133+ and CD133- cells from metastatic melanoma, and that these two populations have distinct characteristics not related to the cancer stem cell phenotype. In vitro, gene set enrichment analysis indicated CD133+ cells were related to a proliferative phenotype, whereas CD133- cells were of an invasive phenotype. However, in vivo, serial transplantation of CD133+ and CD133- tumours over 7 generations showed that both populations were equally able to initiate and propagate tumours. Despite this, both populations remained phenotypically distinct, with CD133- cells only able to express CD133 in vivo and not in vitro. Loss of CD133 from the surface of a CD133+ cell was observed in vitro and in vivo; however, CD133- cells derived from CD133+ cells retained the CD133+ phenotype, even in the presence of signals from the tumour microenvironment. Conclusion: We show for the first time the necessity of iterative sorting to isolate pure marker-positive and marker-negative populations for comparative studies, and present evidence that despite CD133+ and CD133- cells being equally tumourigenic, they display distinct phenotypic differences, suggesting CD133 may define a distinct lineage in melanoma.

    Carbon Monoxide Promotes Respiratory Hemoproteins Iron Reduction Using Peroxides as Electron Donors

    The physiological role of the respiratory hemoproteins (RH), hemoglobin and myoglobin, is to deliver O2 via its binding to their ferrous (FeII) heme-iron. Under a variety of pathological conditions, RH proteins leak into blood plasma and are oxidized to ferric (FeIII, met) forms, becoming a source of oxidative vascular damage. However, recent studies have indicated that both metRH and peroxides induce the Heme Oxygenase (HO) enzyme, which produces carbon monoxide (CO). The gas has an extremely high affinity for the ferrous heme-iron and is known to reduce ferric hemoproteins in the presence of suitable electron donors. We hypothesized that under in vivo plasma conditions, peroxides at low concentration can assist the reduction of metRH in the presence of CO. The effect of CO on the interaction of metRH with hydrophilic or hydrophobic peroxides was analyzed by following Soret and visible light absorption changes in reaction mixtures. It was found that under anaerobic conditions and at low concentrations of RH and peroxides mimicking plasma conditions, peroxides served as electron donors and RH were reduced to their ferrous carboxy forms. The reaction rates were dependent on CO as well as peroxide concentrations. These results demonstrate that the oxidative activity of acellular ferric RH and peroxides may be amended by CO, which turns on the reducing potential of peroxides and facilitates the formation of redox-inactive carboxyRH. Our data suggest a possible role of HO/CO in protecting the vascular system from oxidative damage.

    Peroxiredoxin 3 Is a Redox-Dependent Target of Thiostrepton in Malignant Mesothelioma Cells

    Thiostrepton (TS) is a thiazole antibiotic that inhibits expression of FOXM1, an oncogenic transcription factor required for cell cycle progression and resistance to oncogene-induced oxidative stress. The mechanism of action of TS is unclear and strategies that enhance TS activity will improve its therapeutic potential. Analysis of human tumor specimens showed FOXM1 is broadly expressed in malignant mesothelioma (MM), an intractable tumor associated with asbestos exposure. The mechanism of action of TS was investigated in a cell culture model of human MM. As for other tumor cell types, TS inhibited expression of FOXM1 in MM cells in a dose-dependent manner. Suppression of FOXM1 expression and coincidental activation of ERK1/2 by TS were abrogated by pre-incubation of cells with the antioxidant N-acetyl-L-cysteine (NAC), indicating its mechanism of action in MM cells is redox-dependent. Examination of the mitochondrial thioredoxin reductase 2 (TR2)-thioredoxin 2 (TRX2)-peroxiredoxin 3 (PRX3) antioxidant network revealed that TS modifies the electrophoretic mobility of PRX3. Incubation of recombinant human PRX3 with TS in vitro also resulted in PRX3 with altered electrophoretic mobility. The cellular and recombinant species of modified PRX3 were resistant to dithiothreitol and SDS and suppressed by NAC, indicating that TS covalently adducts cysteine residues in PRX3. Reduction of endogenous mitochondrial TRX2 levels by the cationic triphenylmethane gentian violet (GV) promoted modification of PRX3 by TS and significantly enhanced its cytotoxic activity. Our results indicate TS covalently adducts PRX3, thereby disabling a major mitochondrial antioxidant network that counters chronic mitochondrial oxidative stress. 
Redox-active compounds like GV that modify the TR2/TRX2 network may significantly enhance the efficacy of TS, thereby providing a combinatorial strategy for exploiting redox-dependent perturbations in mitochondrial function as a therapeutic approach in mesothelioma.

    Monitoring of microbial hydrocarbon remediation in the soil

    Bioremediation of hydrocarbon pollutants is advantageous owing to the cost-effectiveness of the technology and the ubiquity of hydrocarbon-degrading microorganisms in the soil. Soil microbial diversity is affected by hydrocarbon perturbation, so selective enrichment of hydrocarbon utilizers occurs. Hydrocarbons interact with the soil matrix and soil microorganisms, which determine the fate of the contaminants according to their chemical nature and microbial degradative capabilities, respectively. Provided the polluted soil has the requisite values for environmental factors that influence microbial activities and there are no inhibitors of microbial metabolism, there is a good chance that there will be a viable and active population of hydrocarbon-utilizing microorganisms in the soil. Microbial methods for monitoring bioremediation of hydrocarbons include chemical, biochemical and microbiological molecular indices that measure rates of microbial activities to show that, in the end, the target goal of pollutant reduction to a safe and permissible level has been achieved. Enumeration and characterization of hydrocarbon degraders, use of the microtiter plate-based most probable number technique, community-level physiological profiling, phospholipid fatty acid analysis, 16S rRNA- and other nucleic acid-based molecular fingerprinting techniques, metagenomics, microarray analysis, respirometry and gas chromatography are some of the methods employed in bio-monitoring of hydrocarbon remediation, as presented in this review.

    Nutraceutical therapies for atherosclerosis

    Atherosclerosis is a chronic inflammatory disease affecting large and medium arteries and is considered to be a major underlying cause of cardiovascular disease (CVD). Although the development of pharmacotherapies to treat CVD has contributed to a decline in cardiac mortality in the past few decades, CVD is estimated to be the cause of one-third of deaths globally. Nutraceuticals are natural nutritional compounds that are beneficial for the prevention or treatment of disease and, therefore, are a possible therapeutic avenue for the treatment of atherosclerosis. The purpose of this Review is to highlight potential nutraceuticals for use as antiatherogenic therapies with evidence from in vitro and in vivo studies. Furthermore, the current evidence from observational and randomized clinical studies into the role of nutraceuticals in preventing atherosclerosis in humans is also discussed.