22 research outputs found

    Selection of Variables that Influence Drug Injection in Prison: Comparison of Methods with Multiple Imputed Data Sets

    Background: Prisoners are at greater risk of infection than the general population, and drug injection is the main route of HIV transmission, particularly in Iran. It is therefore of interest to determine the variables that govern drug injection among prisoners. However, one of the issues that challenge model building is incomplete national data sets. In this paper, we addressed the process of model development when missing data exist. Methods: Complete data on 2720 prisoners were available. A logistic regression model was fitted and served as the gold standard. We then randomly omitted 20% and 50% of the data. Missing data were imputed 10 times using multiple imputation by chained equations (MICE). Rubin's rules (RR) were applied to select candidate variables and to combine the results across imputed data sets. In the S1, S2, and S3 methods, variables that remained significant in one, five, and ten imputed data sets, respectively, were candidates for the multifactorial model. Two weighting approaches were also applied. Findings: Age at onset of drug use, recent drug use before imprisonment, being single, and length of imprisonment were significantly associated with drug injection among prisoners. All variable selection schemes were able to detect the significance of these variables. Conclusion: The performance of the simpler variable selection methods was comparable with that of RR, indicating that a screening step can be used to select candidate variables for the multifactorial model.
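    The impute-fit-pool workflow described above is straightforward to prototype. Below is a minimal sketch, not the authors' code, using the MICE implementation in Python's statsmodels on synthetic data; the variable names (inject, age_onset, single) are hypothetical stand-ins, and the pooled summary combines estimates across the imputed data sets by Rubin's rules.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.imputation import mice

rng = np.random.default_rng(0)
n = 500
age_onset = rng.normal(22, 5, n)
single = rng.integers(0, 2, n).astype(float)
logit_p = -4 + 0.1 * age_onset + 0.8 * single
inject = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(float)
df = pd.DataFrame({"inject": inject, "age_onset": age_onset, "single": single})

# Randomly delete 20% of each predictor, mimicking the 20% omission scheme.
for col in ["age_onset", "single"]:
    df.loc[rng.random(n) < 0.2, col] = np.nan

imp = mice.MICEData(df)                              # chained-equations imputer
model = mice.MICE("inject ~ age_onset + single", sm.GLM, imp,
                  init_kwds={"family": sm.families.Binomial()})
results = model.fit(n_burnin=10, n_imputations=10)   # 10 imputed data sets
print(results.summary())                             # estimates pooled by Rubin's rules
```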

    FRAMEWORK FOR RELIABILITY, MAINTAINABILITY AND AVAILABILITY ANALYSIS OF GAS PROCESSING SYSTEM DURING OPERATION PHASE

    In facing many operational challenges, such as increased expectations in bottom-line performance and escalating overhead costs, petrochemical plants nowadays need to strive continually for higher reliability and availability by means of effective improvement tools. Reliability, maintainability and availability (RAM) analysis has been recognised as one of the strategic tools for improving plant reliability at the operation phase. Nevertheless, the application of RAM among industrial practitioners is still limited, generally owing to the impracticality and complexity of existing approaches. Hence, it is important to enhance these approaches so that companies can apply them practically to achieve their operational goals. The objectives of this research are to develop frameworks for applying reliability, maintainability and availability analysis to a gas processing system at the operation phase, in order to improve operational and maintenance performance. In addition, the study focuses on ways to apply existing statistical approaches and to incorporate input from field experts for the prediction of reliability-related measures. Furthermore, it explores and highlights major issues involved in implementing RAM analysis in the oil and gas industry and offers viable solutions. In this study, a systematic analysis of each RAM component is proposed, and their roles as strategic improvement and decision-making tools are discussed and demonstrated using case studies of two plant systems. In reliability and maintainability (R&M) analysis, two main steps, exploratory and inferential, are proposed. Tools such as Pareto charts, trend plots and hazard functions, together with Kaplan-Meier (KM) estimation and the proportional hazards model (PHM), are used in the exploratory phase to identify elements critical to the system's R&M performance. In the inferential analysis, a systematic methodology is presented for assessing R&M-related measures.
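    As a concrete illustration of the exploratory tools named above, the sketch below uses the lifelines library on synthetic failure data; the 'load' covariate is a hypothetical stand-in for an operating condition. It produces a Kaplan-Meier reliability estimate and fits a proportional hazards model.

```python
import numpy as np
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter

rng = np.random.default_rng(1)
n = 200
load = rng.uniform(0, 1, n)
ttf = rng.exponential(scale=1000 * np.exp(-load))   # time to failure, hours
observed = ttf < 800                                # right-censor at 800 h
time = np.minimum(ttf, 800)

kmf = KaplanMeierFitter()
kmf.fit(time, event_observed=observed)              # nonparametric reliability estimate
print(kmf.median_survival_time_)

df = pd.DataFrame({"time": time, "event": observed, "load": load})
cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event") # PHM: effect of load on hazard
cph.print_summary()
```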

    Studies in condition based maintenance using proportional hazards models with imperfect observations

    Introduction and literature review -- Preliminary notations -- Problem statement -- Optimal inspection period and replacement policy for CBM with imperfect information using PHM -- Problem formulation -- Formulation of the POMDP -- Long-run average cost and total long-run average cost -- Optimal inspection period -- Numerical example -- Evaluating the remaining life for equipment with unobservable states -- Practical implications -- Model assumptions -- Development of parameter estimation methods for condition-based maintenance with indirect observations -- Proposed model -- Parameter estimation -- Optimal inspection interval and optimal replacement policy -- Reliability function and mean residual life -- Estimation of the model's parameters

    Predicting the risk and trajectory of intensive care patients using survival models

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. Includes bibliographical references (p. 119-126). Using artificial intelligence to assist physicians in patient care has received sustained interest over the past several decades. Recently, with automated systems at most bedsides, the amount of patient information collected continues to increase, providing specific impetus for intelligent systems that can interpret this information. In fact, the large set of sensors and test results, often measured repeatedly over long periods of time, makes it challenging for caregivers to quickly utilize all of the data for optimal patient treatment. This research focuses on predicting the survival of ICU patients throughout their stay. Unlike traditional static mortality models, this survival prediction is explored as an indicator of patient state and trajectory. Using survival analysis techniques and machine learning, models are constructed that predict individual patient survival probabilities at fixed intervals in the future. These models seek to help physicians interpret the large amount of data available in order to provide optimal patient care. We find that the survival predictions from our models are comparable to survival predictions using the SAPS score, but are available throughout the patient's ICU course instead of only at 24 hours after admission. Additionally, we demonstrate effective prediction of patient mortality over fixed windows in the future. by Caleb W. Hug. S.M.
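    A minimal sketch of the kind of fixed-horizon survival prediction described above, again not the thesis code: a proportional hazards model fitted with lifelines on synthetic data, with hypothetical bedside covariates (hr, lactate), queried for each patient's survival probability at fixed future windows.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
n = 300
hr = rng.normal(90, 15, n)
lactate = rng.exponential(2.0, n)
risk = 0.02 * (hr - 90) + 0.3 * lactate
t = rng.exponential(scale=200 * np.exp(-risk))      # synthetic survival times
df = pd.DataFrame({"t": np.minimum(t, 240),
                   "event": t < 240, "hr": hr, "lactate": lactate})

cph = CoxPHFitter().fit(df, duration_col="t", event_col="event")

# Survival probability for each patient at fixed future windows (hours),
# recomputable at any point in the stay once covariates are updated.
horizons = [24, 48, 72]
surv = cph.predict_survival_function(df[["hr", "lactate"]], times=horizons)
print(surv.round(3).iloc[:, :5])   # rows: horizons, columns: patients
```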

    MULTIPLE IMPUTATION TO CORRECT FOR MEASUREMENT ERROR: Application to Chronic Disease Case Ascertainment in Administrative Health Databases

    Diagnosis codes in administrative health databases (AHDs) are commonly used to ascertain chronic disease cases for research and surveillance. Low sensitivity of diagnosis codes has been demonstrated in many studies that validate AHDs against a gold-standard data source in which the true disease status is known. This results in misclassification of disease status, which can lead to biased prevalence estimates and loss of power to detect associations between disease status and health outcomes. Model-based case detection algorithms, combined with multiple imputation (MI) methods in validation-dataset/main-dataset designs, can be used to correct for misclassification of chronic disease status in AHDs. Under this approach, a predictive model of disease status (e.g., a logistic model) is constructed in the validation dataset, the model parameters are estimated, and MI methods are used to impute true disease status in the main dataset. This research considered scenarios in which the misclassification of the observed disease status is either independent of or dependent on disease predictors. When the misclassification is independent of disease predictors, MI based on a frequentist logistic model (with and without bias correction) was compared with MI based on a Bayesian logistic model. When the misclassification is dependent on disease predictors, MI based on frequentist logistic models with different sets of covariates was compared. Monte Carlo techniques were used to investigate the effects of the following data and model characteristics on bias and error in chronic disease prevalence estimates from AHDs: sensitivity of the observed disease status based on diagnosis codes, size of the validation dataset, number of imputations, and magnitude of measurement error in covariates of the predictive model. Relative bias, root mean squared error (RMSE) and coverage of the 95% confidence interval were used to measure performance. Without bias correction, the Bayesian MI model had lower RMSE than the frequentist MI model, and the frequentist MI model with bias correction was demonstrated via a simulation study to outperform both the Bayesian MI model and the frequentist MI model without bias correction. The results indicate that MI works well for measurement error correction provided the unobserved true values are not missing not at random, regardless of whether the observed disease diagnosis depends on other disease predictors. Increasing the size of the validation dataset improves the performance of MI more than increasing the number of imputations does.
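    The validation-dataset/main-dataset design lends itself to a short sketch. The following is a simplified illustration, not the study's code: a logistic model for true disease status is fitted in a synthetic validation set, true status is then multiply imputed in the main set (drawing coefficients from their approximate posterior so the imputations are "proper"), and prevalence estimates are pooled with Rubin's rules.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

def simulate(n):
    age = rng.normal(60, 10, n)
    code = rng.integers(0, 2, n).astype(float)   # error-prone diagnosis code
    p_true = 1 / (1 + np.exp(-(-5 + 0.05 * age + 2.0 * code)))
    true = (rng.random(n) < p_true).astype(float)
    X = sm.add_constant(np.column_stack([age, code]))
    return X, true

X_val, y_val = simulate(500)      # validation set: true status known
X_main, _ = simulate(5000)        # main set: true status to be imputed

fit = sm.GLM(y_val, X_val, family=sm.families.Binomial()).fit()

m = 20
prev = np.empty(m)
for i in range(m):
    # Draw coefficients from their approximate posterior so that model
    # uncertainty propagates into the imputations.
    beta = rng.multivariate_normal(fit.params, fit.cov_params())
    p = 1 / (1 + np.exp(-X_main @ beta))
    prev[i] = (rng.random(len(p)) < p).mean()    # imputed true status

W = (prev * (1 - prev) / len(X_main)).mean()     # mean within-imputation variance
B = prev.var(ddof=1)                             # between-imputation variance
T = W + (1 + 1 / m) * B                          # Rubin's rules total variance
print(f"prevalence estimate: {prev.mean():.3f} +/- {1.96 * np.sqrt(T):.3f}")
```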

    A DYNAMIC LANDSCAPE OF FEAR: HUMAN IMPACTS ON CARNIVORE COMMUNITIES

    Mammalian carnivores are elusive, enigmatic species that often play keystone roles in ecosystems through direct (i.e., predation) and indirect (i.e., perceived predation risk) effects. Worldwide, many carnivore species are experiencing rapid human-mediated population declines due to landscape change and habitat disturbance. For researchers, carnivores present unique challenges due to their large home ranges, low population densities, sensitivity to human disturbance, and direct persecution. Further, growing evidence shows that human activity can impact carnivore behavior and community structure by altering predator-prey interactions, shifting diel activity patterns, and altering wildlife movement, leading to increased sightings, nuisance reports, and harvests. To investigate how human activity influences U.S. carnivore communities, I explored variation in the spatiotemporal activity of the American black bear and bobcat and assessed carnivore co-occurrence using camera-trap data. I constructed diel activity density curves, applied multispecies occupancy models, and calculated attraction-avoidance ratios to describe relationships among members of the carnivore guild relative to various types of human activity. My results suggested that the bobcat can function as a dominant carnivore depending on community structure, with dominant carnivores (i.e., wolves, pumas) influenced primarily by human-related factors and subordinate carnivores (i.e., foxes) impacted by environmental factors. Further, American black bear activity did not vary with different types of human activity, yet during the annual hunting season protected areas were positively associated with black bear presence, along with increased nocturnal activity. Understanding the influence human activity has on carnivore community dynamics is critical for establishing successful management practices that promote the persistence of carnivore guilds.
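    The diel activity density curves mentioned above are, in essence, kernel density estimates on a 24-hour circle. Below is a minimal sketch, not the dissertation's code, with synthetic detection times and a von Mises kernel so the estimate wraps correctly at midnight.

```python
import numpy as np
from scipy.special import i0

def diel_density(hours, grid_hours, kappa=10.0):
    """Circular KDE of detection times (0-24 h) using von Mises kernels."""
    theta = np.asarray(hours) * 2 * np.pi / 24       # hours -> radians
    grid = np.asarray(grid_hours) * 2 * np.pi / 24
    kern = np.exp(kappa * np.cos(grid[:, None] - theta[None, :]))
    kern /= 2 * np.pi * i0(kappa)                    # von Mises normalisation
    return kern.mean(axis=1) * 2 * np.pi / 24        # density per hour

detections = np.array([2.1, 3.4, 4.0, 21.5, 22.3, 23.9, 0.4])  # synthetic times
grid = np.linspace(0, 24, 9)
print(np.round(diel_density(detections, grid), 3))   # peaks around midnight
```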

    Statistical models in prognostic modelling with many skewed variables and missing data: a case study in breast cancer

    Prognostic models have clinical appeal as aids to therapeutic decision making. In the UK, the Nottingham Prognostic Index (NPI) has been used for over two decades to inform patient management. However, it has been commented that the NPI is not capable of identifying a subgroup of patients with a prognosis so good that adjuvant therapy, with its potentially harmful side effects, can be withheld safely. Tissue Microarray Analysis (TMA) now makes it possible to measure biological tissue microarray features of frozen biopsies from breast cancer tumours. These give an insight into the biology of the tumour and hence could have the potential to enhance prognostic modelling. I therefore wished to investigate whether biomarkers can add value to clinical predictors to provide improved prognostic stratification in terms of Recurrence-Free Survival (RFS). However, there are very many biomarkers that could be measured, they usually exhibit skewed distributions, and missing values are common. The statistical issues raised are thus the number of variables being tested, the form of the association, imputation of missing data, and assessment of the stability and internal validity of the model. The specific aim of this study was therefore to develop, and to demonstrate the performance of, statistical modelling techniques useful in circumstances where there is a surfeit of explanatory variables and missing data; in particular, to achieve useful and parsimonious models while guarding against instability and overfitting. I also sought to identify a subgroup of patients with a prognosis so good that a decision can be made to avoid adjuvant therapy. I aimed to provide statistically robust answers to a set of clinical questions and to develop strategies, for use in such data sets, that would be useful and acceptable to clinicians. A unique data set of 401 Estrogen Receptor positive (ER+), tamoxifen-treated breast cancer patients with measurements for a large panel of biomarkers (72 in total) was available. Taking a statistical approach, I applied a multi-faceted screening process to select a limited set of potentially informative variables and to detect the appropriate form of the association, followed by multiple imputation of missing data and bootstrapping. In comparison with the NPI, the final joint model derived assigned patients to more appropriate risk groups (14% of recurred and 4% of non-recurred cases). The actuarial 7-year RFS rate for patients in the lowest-risk quartile was 95% (95% C.I.: 89%, 100%). To evaluate an alternative approach, biological knowledge was incorporated into the process of model development. Model building began with the use of biological expertise to divide the variables into substantive biomarker sets on the basis of their presumed role in the pathway to cancer progression. For each biomarker family, an informative and parsimonious index was generated by combining the family's variables, to be offered to the final model as an intermediate predictor. In comparison with the NPI, this model assigned patients to more appropriate risk groups (21% of recurred and 11% of non-recurred patients). It identified a low-risk group with a 7-year RFS rate of 98% (95% C.I.: 96%, 100%).
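    One of the safeguards against instability mentioned above, bootstrapping, can be sketched briefly: refit the model on resamples and record how often each variable survives a significance screen (a bootstrap inclusion fraction). The sketch below is illustrative only, using synthetic data and a logistic model in place of a survival model for brevity.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n, p = 400, 10
X = rng.normal(size=(n, p))                     # stand-ins for biomarkers
lin = 0.8 * X[:, 0] - 0.6 * X[:, 1]             # only two are truly informative
y = (rng.random(n) < 1 / (1 + np.exp(-lin))).astype(float)

B = 200
kept = np.zeros(p)
for _ in range(B):
    idx = rng.integers(0, n, n)                 # bootstrap resample
    fit = sm.Logit(y[idx], sm.add_constant(X[idx])).fit(disp=0)
    kept += fit.pvalues[1:] < 0.05              # simple p-value screen

print(np.round(kept / B, 2))                    # bootstrap inclusion fractions
```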

    Fractional polynomial and restricted cubic spline models as alternatives to categorising continuous data: applications in medicine

    Continuous predictor variables are often categorised when reporting their influence on the outcome of interest. This does not make use of within-category information. Alternative methods of handling continuous predictor variables, such as fractional polynomials (FPs) and restricted cubic splines (RCS), exist. This thesis first investigates the current extent of categorisation in comparison with these alternative methods. The performance of the categorisation, linearisation, FP and RCS approaches is then investigated using novel simulations covering a range of plausible scenarios, including tick-shaped associations. The simulations start with continuous outcomes and then move on to predictive models in which the outcome itself is dichotomised into a binary outcome. Finally, a novel application of the four methods is performed using the UK Biobank data, incorporating the additional issues of confounding and interaction. This thesis shows that the practice of categorisation is still widely used in epidemiology, whilst alternative methods such as FPs and RCS are not. In addition, this research shows that categorising a continuous variable into a few categories produces functions with large RMSEs, obscures true relations and has less predictive ability than the linear, FP and RCS models. Finally, this thesis shows that nonlinearity and interaction terms are more easily detected when applying the FP and RCS methods. The thesis concludes by encouraging medical researchers to consider the application of FP and RCS models in their studies.
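    The contrast the thesis draws between categorisation and splines can be illustrated in a few lines. The sketch below, with synthetic data following a tick-shaped association, fits a categorised model and a restricted (natural) cubic spline via patsy's cr() basis in a statsmodels formula; it is an illustration, not the thesis code.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 500)
y = np.where(x < 3, 3 - x, 0.5 * (x - 3)) + rng.normal(0, 0.5, 500)  # tick shape
df = pd.DataFrame({"x": x, "y": y,
                   "x_cat": pd.cut(x, bins=[0, 2.5, 5, 7.5, 10]).astype(str)})

cat_fit = smf.ols("y ~ C(x_cat)", data=df).fit()       # categorised predictor
rcs_fit = smf.ols("y ~ cr(x, df=4)", data=df).fit()    # restricted cubic spline

print(f"categorised R2: {cat_fit.rsquared:.3f}")
print(f"spline R2:      {rcs_fit.rsquared:.3f}")       # spline tracks the tick shape
```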

    Prognostic factors for epilepsy

    Introduction and Aims: Epilepsy is a neurological disorder that is heterogeneous in terms of both cause and prognosis. Prognostic factors identify patients at varying degrees of risk for specific outcomes, which facilitates treatment choice and aids patient counselling. Few prognostic models based on prospective cohorts or randomised controlled trial data have been published in epilepsy. Patients with epilepsy can be loosely categorised as having had a first seizure, being newly diagnosed with epilepsy, having established epilepsy, or having frequent unremitting seizures despite optimum treatment. This thesis concerns modelling prognostic factors for these patient groups, for outcomes including seizure recurrence, seizure remission and treatment failure. Methods: Methods for modelling prognostic factors are discussed and applied to several examples, including eligibility to drive following a first seizure and following withdrawal of treatment after a period of remission from seizures. Internal and external model validation techniques are reviewed. The latter is investigated further in a simulation study, the results of which are demonstrated in a motivating example. Mixture modelling is introduced and assessed to better predict whether a patient will achieve remission from seizures immediately, at a later time point, or never. Results: Multivariable models identified a number of significant factors, and the future risk of a seizure was thereby obtained for various patient subgroups. The models identified that the chance of a second seizure fell below the risk threshold for driving, set by the DVLA, after six months, and that the risk of a seizure following treatment withdrawal after a period of remission fell below the risk threshold after three months. Selected models were found to be internally valid, and the simulation study indicated that concordance and a variety of imputation methods for handling covariates missing from the validation dataset were useful approaches for the external validation of prognostic models. Assessing these methods for a selected model indicated that the model was valid in independent datasets. Mixture modelling techniques begin to show an improved prognostic model for the frequently reported outcome, time to 12-month remission. Conclusions: The models described within this thesis can be used to predict outcomes for patients with first seizures or epilepsy, aiding individual patient risk stratification and the design and analysis of future epilepsy trials. Prognostic models are not commonly externally validated. A method of external validation in the presence of a missing covariate has been proposed; it may facilitate validation of prognostic models, making the evidence base more transparent and reliable and instilling confidence in any significant findings.
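    A minimal sketch of the concordance measure used for external validation above: Harrell's c-index computed on an independent (here synthetic) data set with lifelines; the risk scores stand in for a fitted model's predictions.

```python
import numpy as np
from lifelines.utils import concordance_index

rng = np.random.default_rng(6)
n = 250
risk = rng.normal(size=n)                     # model's predicted risk score
t = rng.exponential(scale=np.exp(-risk))      # higher risk -> earlier seizure
observed = rng.random(n) < 0.8                # some follow-up is censored

# concordance_index expects scores where higher means a *later* event,
# so pass the negative of the risk score.
c = concordance_index(t, -risk, event_observed=observed)
print(f"c-index on validation data: {c:.3f}")
```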