
    Data Driven Nonparametric Detection

    The major goal of signal detection is to distinguish between hypotheses about the state of events based on observations. Signal detection is typically categorized into centralized detection, where all observed data are available for decision making, and decentralized detection, where only quantized data from distributed sensors are forwarded to a fusion center for decision making. While these problems have been intensively studied under parametric and semi-parametric models, in which the underlying distributions are fully or partially known, nonparametric scenarios are not yet well understood. This thesis mainly explores nonparametric models with unknown underlying distributions, as well as semi-parametric models as an intermediate step toward solving nonparametric problems. One major topic of this thesis is nonparametric decentralized detection, in which the joint distribution of the state of an event and the sensor observations is not known, but only some training data are available. A kernel-based nonparametric approach has been proposed by Nguyen, Wainwright and Jordan, in which all sensors are treated as having equal quality. We study heterogeneous sensor networks and propose a weighted kernel, in which weight parameters selectively incorporate sensors' information into the fusion center's decision rule based on the quality of the sensors' observations. The weight parameters also serve as sensor selection parameters, with nonzero parameters corresponding to selected sensors. Sensor selection is performed jointly with the design of the sensors' and fusion center's decision rules, and the resulting optimal decision rule has only a sparse set of nonzero weight parameters. A gradient projection algorithm and a Gauss-Seidel algorithm are developed to solve the risk minimization problem, which is non-convex, and both algorithms are shown to converge to critical points. The other major topic of this thesis is composite outlier detection in centralized scenarios. The goal is to detect the existence of data streams drawn from outlying distributions among data streams drawn from a typical distribution. We study both the semi-parametric model, with a known typical distribution and unknown outlying distributions, and the nonparametric model, with unknown typical and outlying distributions. For both models, we construct generalized likelihood ratio tests (GLRTs) and show that, with knowledge of the KL divergence between the outlying and typical distributions, the GLRT is exponentially consistent (i.e., the error risk function decays exponentially fast). We also show that, with knowledge of the Chernoff distance between the outlying and typical distributions, the GLRT for the semi-parametric model achieves the same risk decay exponent as the parametric model, and the GLRT for the nonparametric model achieves the same performance as the number of data streams becomes asymptotically large. We further show that, for both models, without any knowledge about the distance between the distributions, no exponentially consistent test exists. However, the GLRT with a diminishing threshold can still be consistent.
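
    The following is a minimal sketch of the weighted-kernel idea described above, assuming a Gaussian base kernel per sensor; the function names, fixed bandwidth, and kernel-expansion form are illustrative assumptions, and the thesis's actual training procedure for the weights and expansion coefficients is not reproduced here.

    import numpy as np

    # Sketch only: each sensor contributes a Gaussian kernel on its observations,
    # scaled by a nonnegative weight; sensors with zero weight are deselected.

    def sensor_kernel(x, y, bandwidth=1.0):
        """Gaussian kernel between two observation vectors from one sensor."""
        diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
        return np.exp(-np.sum(diff ** 2) / (2.0 * bandwidth ** 2))

    def weighted_kernel(obs_a, obs_b, weights):
        """Weighted sum of per-sensor kernels; obs_* are lists indexed by sensor."""
        return sum(w * sensor_kernel(xa, xb)
                   for w, xa, xb in zip(weights, obs_a, obs_b) if w > 0.0)

    def fusion_decision(test_obs, train_obs, train_labels, alphas, weights):
        """Kernel-expansion decision rule at the fusion center (labels in {-1, +1})."""
        score = sum(a * y * weighted_kernel(test_obs, tr, weights)
                    for a, y, tr in zip(alphas, train_labels, train_obs))
        return 1 if score >= 0 else -1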

    Reference Evapotranspiration Changes in the Haihe River Basin during Past 50 Years

    In this paper, temporal trends of annual reference evapotranspiration (ET0), calculated using the FAO Penman-Monteith equation with observed daily meteorological data at six stations (Datong, Weichang, Qinhuangdao, Tianjin, Beijing and Huimin) of the Haihe River Basin, China, were detected with the help of the parametric t-test and Mann-Kendall (MK) analysis. The six stations were divided into three classes representing mountain (Datong and Weichang), continental (Beijing and Huimin) and coastal (Qinhuangdao and Tianjin) areas, respectively. The results show a significant upward trend in ET0 in the mountain area of the Haihe River Basin. On the contrary, a significant downward trend in ET0 was found in the coastal area. Moreover, the analyses of ET0 in the continental area indicate that after 1960, ET0 at Beijing showed a sharp, significant increase, while ET0 at Huimin showed only moderate variation.
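
    As a rough illustration of the Mann-Kendall test used above for trend detection in an annual ET0 series, the sketch below applies the standard normal approximation; it omits the tie correction that a full implementation would include, and all names are illustrative assumptions.

    import numpy as np
    from scipy.stats import norm

    def mann_kendall(series, alpha=0.05):
        """Two-sided Mann-Kendall trend test on a 1-D series (no tie correction)."""
        x = np.asarray(series, dtype=float)
        n = len(x)
        # S statistic: number of concordant pairs minus discordant pairs
        s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
        var_s = n * (n - 1) * (2 * n + 5) / 18.0
        z = 0.0 if s == 0 else (s - np.sign(s)) / np.sqrt(var_s)
        p = 2.0 * (1.0 - norm.cdf(abs(z)))      # two-sided p-value
        if p >= alpha:
            return z, p, "no significant trend"
        return z, p, "increasing" if z > 0 else "decreasing"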

    USING ARTIFICIAL INTELLIGENCE TO IMPROVE HEALTHCARE QUALITY AND EFFICIENCY

    In recent years, artificial intelligence (AI), especially machine learning (ML) and deep learning (DL), has represented one of the most exciting advances in science. The performance of ML-based AI in many areas, such as computer vision, voice recognition, and natural language processing, has improved dramatically, offering unprecedented opportunities for application in a variety of domains. In the critical domain of healthcare, great potential exists for broader application of ML to improve quality and efficiency. At the same time, there are substantial challenges in the development and implementation of AI in healthcare. This dissertation studies the application of state-of-the-art AI technologies in healthcare, ranging from original method development to model interpretation and real-world implementation. First, a novel DL-based method is developed to efficiently analyze rich and complex electronic health record data. This DL-based approach shows promise in facilitating the analysis of real-world data and can complement clinical knowledge by revealing deeper insights; it demonstrably boosts both knowledge discovery and the performance of predictive models. Second, a recurrent neural network (named LSTM-DL) is developed and shown to outperform all existing methods on an important real-world problem, patient cost prediction. A series of novel analyses is used to derive a deeper understanding of deep learning’s advantages. The LSTM-DL model consistently outperforms other models by nearly the same margin across different subgroups. Interestingly, the advantage of LSTM-DL is significantly driven by the amount of fluctuation in the sequential data. By opening the “black box” and examining the parameters learned during training, it is demonstrated that LSTM-DL’s ability to react to high fluctuation is gained during training rather than inherited from its special architecture. LSTM-DL can also learn to be less sensitive to fluctuation when fluctuation does not play an important role. Finally, the implementation of ML models in real practice is studied. Since, at its current stage of development, ML-based AI will most likely assist human workers rather than replace them, it is critical to understand how human workers collaborate with AI. An AI tool was developed in collaboration with a medical coding company and successfully implemented in a real work environment, and its impact on worker performance is examined. Findings show that use of the AI tool can significantly boost the productivity of human coders. The heterogeneity of the AI’s effects is further investigated, and results show that the human circadian rhythm and coder seniority are both significant factors conditioning the productivity gains. One interesting finding regarding heterogeneity is that the AI has its strongest effect when a coder is at her/his peak of performance (as opposed to other times), which supports the theory of human-AI complementarity. However, this theory does not necessarily hold across different coders. While it might be assumed that senior coders would benefit more from the AI, junior coders’ productivity is found to improve more. A further qualitative study uncovers the mechanism driving this effect: senior coders express strong resistance to AI, and their low trust in AI significantly hinders them from realizing the AI’s value.
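
    A minimal sketch of an LSTM-based cost-prediction model in the spirit of the LSTM-DL model described above, assuming PyTorch; the layer sizes, single regression head, and use of the last hidden state are illustrative assumptions rather than the dissertation's actual architecture.

    import torch
    import torch.nn as nn

    class CostLSTM(nn.Module):
        """Toy sequence-to-one regressor: monthly utilization features -> next-period cost."""
        def __init__(self, n_features, hidden_size=64):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden_size, batch_first=True)
            self.head = nn.Linear(hidden_size, 1)

        def forward(self, x):                      # x: (batch, time, n_features)
            _, (h_n, _) = self.lstm(x)             # h_n: (1, batch, hidden_size)
            return self.head(h_n[-1]).squeeze(-1)  # one cost prediction per patient

    # Example: a batch of 32 patients, 12 months of 10 features each.
    model = CostLSTM(n_features=10)
    predictions = model(torch.randn(32, 12, 10))   # shape: (32,)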

    The Impact of Stigmatizing Language in EHR Notes on AI Performance and Fairness

    Today, there is significant interest in using electronic health record (EHR) data to generate new clinical insights for diagnosis and treatment decisions. However, there are concerns that such data may be biased and may accentuate racial disparities. We study how clinician biases reflected in EHR notes affect the performance and fairness of artificial intelligence (AI) models in the context of mortality prediction for intensive care unit patients. We apply a Transformer-based deep learning model and explainable AI techniques to quantify the negative impacts on performance and fairness. Our findings demonstrate that stigmatizing language (SL) written by clinicians adversely affects AI performance, particularly so for black patients, highlighting SL as a source of racial disparity in AI model development. As an effective mitigation approach, removing SL from EHR notes can significantly improve AI performance and fairness. This study provides actionable insights for responsible AI development and contributes to understanding clinician EHR note writing.
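
    One way to quantify the kind of performance gap studied above is to compare a discrimination metric on notes with and without stigmatizing language within a patient subgroup; the sketch below uses AUROC and hypothetical variable names, which are assumptions rather than the study's exact metrics.

    import numpy as np
    from sklearn.metrics import roc_auc_score

    def sl_performance_gap(y_true, y_score, has_sl, subgroup):
        """AUROC without SL minus AUROC with SL, restricted to one patient subgroup.

        y_true   : 0/1 mortality labels
        y_score  : model risk scores
        has_sl   : boolean array, note contains stigmatizing language
        subgroup : boolean array, e.g. black patients
        """
        y_true, y_score = np.asarray(y_true), np.asarray(y_score)
        has_sl, subgroup = np.asarray(has_sl, bool), np.asarray(subgroup, bool)
        auc_without = roc_auc_score(y_true[subgroup & ~has_sl], y_score[subgroup & ~has_sl])
        auc_with = roc_auc_score(y_true[subgroup & has_sl], y_score[subgroup & has_sl])
        return auc_without - auc_with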

    People Talking and AI Listening: How Stigmatizing Language in EHR Notes Affect AI Performance

    Electronic health records (EHRs) serve as an essential data source for the envisioned artificial intelligence (AI)-driven transformation in healthcare. However, clinician biases reflected in EHR notes can lead to AI models inheriting and amplifying these biases, perpetuating health disparities. This study investigates the impact of stigmatizing language (SL) in EHR notes on mortality prediction using a Transformer-based deep learning model and explainable AI (XAI) techniques. Our findings demonstrate that SL written by clinicians adversely affects AI performance, particularly so for black patients, highlighting SL as a source of racial disparity in AI model development. To explore an operationally efficient way to mitigate SL's impact, we investigate patterns in the generation of SL through a clinicians' collaborative network and identify central clinicians as having a stronger impact on racial disparity in the AI model. We find that removing SL written by central clinicians is a more efficient bias-reduction strategy than eliminating all SL in the entire corpus of data. This study provides actionable insights for responsible AI development and contributes to understanding clinician behavior and EHR note writing in healthcare.
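
    As an illustration of the "central clinicians" idea, a collaboration network can be built over clinicians and ranked by a centrality measure; the toy edge list and the choice of degree centrality below are assumptions for the sketch, not the paper's exact network construction.

    import networkx as nx

    # Toy collaboration graph: an edge means two clinicians documented the same patient.
    edges = [("dr_a", "dr_b"), ("dr_a", "dr_c"), ("dr_b", "dr_c"), ("dr_c", "dr_d")]
    G = nx.Graph(edges)

    # Rank clinicians by degree centrality; the top-ranked ones would be the first
    # targets for SL removal under the strategy described above.
    centrality = nx.degree_centrality(G)
    central_clinicians = sorted(centrality, key=centrality.get, reverse=True)[:2]
    print(central_clinicians)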

    Formulation, antileukemia mechanism, pharmacokinetics, and biodistribution of a novel liposomal emodin

    Emodin is a multifunctional traditional Chinese medicine with poor water solubility. D-α-tocopheryl polyethylene glycol 1000 succinate (TPGS) is a pegylated vitamin E derivative. In this study, a novel liposomal emodin conjugated with TPGS was formulated and compared with methoxypolyethyleneglycol 2000-derivatized distearoyl-phosphatidylethanolamine (mPEG2000–DSPE) liposomal emodin. TPGS improved the encapsulation efficiency and stability of emodin-loaded egg phosphatidylcholine/cholesterol liposomes. The TPGS liposomal emodin showed a high encapsulation efficiency of 95.2% ± 3.0%, a particle size of 121.1 ± 44.9 nm, a spherical ultrastructure, and sustained in vitro release, all similar to mPEG2000–DSPE liposomes; only the zeta potential of −13.1 ± 2.7 mV differed significantly from that of mPEG2000–DSPE liposomes. Compared to mPEG2000–DSPE liposomes, TPGS liposomes improved the cytotoxicity of emodin toward leukemia cells by regulating the protein levels of myeloid cell leukemia 1 (Mcl-1), B-cell lymphoma 2 (Bcl-2) and Bcl-2-associated X protein, an effect that was further enhanced by transferrin. TPGS liposomes prolonged the circulation time of emodin in the blood, with an area under the concentration–time curve (AUC) 1.7 times larger than that of free emodin and 0.91 times larger than that of mPEG2000–DSPE liposomes. In addition, TPGS liposomes showed a higher AUC for emodin in the lung and kidney than mPEG2000–DSPE liposomes, and both liposomes elevated the amount of emodin in the heart. Overall, TPGS is a pegylated agent that could potentially be used to formulate a stable liposomal emodin with enhanced therapeutic properties.
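
    For context on the AUC comparisons above, a plasma area under the concentration-time curve is commonly computed with the linear trapezoidal rule over sampled concentrations; the time points and concentrations in this sketch are invented placeholders, not data from this study.

    import numpy as np

    time_h = np.array([0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 12.0])     # sampling times (h)
    conc = np.array([3.1, 2.6, 2.0, 1.3, 0.7, 0.3, 0.1])         # plasma emodin (ug/mL)

    # Linear trapezoidal rule: sum the trapezoid areas between successive samples.
    auc = np.sum((conc[:-1] + conc[1:]) / 2.0 * np.diff(time_h))
    print(f"AUC(0.25-12 h) ~= {auc:.2f} ug*h/mL")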