20 research outputs found

Does the Missing Data Imputation Method Affect the Composition and Performance of Prognostic Models?

Author: Baneshi MR
Talei AR

Background: We have previously shown that imputing missing data with the Multivariable Imputation via Chained Equations (MICE) method is superior to excluding cases with missing data; however, the MICE methodology is complicated, and easier imputation methods are available. The aim of this study was to compare these methods in terms of model composition and performance. Methods: Three hundred and ten breast cancer patients were recruited. Four approaches were applied to impute missing data. First, we adopted an ad hoc method in which missing data for each variable were replaced by the median of the observed values. Then three likelihood-based approaches were used. In regression imputation, the variable with missing data was regressed on the rest of the variables, and the regression equation was used to fill in the missing data. In the Expectation-Maximization (E-M) algorithm, missing data and regression parameters were estimated iteratively until the regression parameters converged. Finally, the MICE method was applied. The resulting models were compared in terms of the variables that contributed significantly to the multifactorial analysis, sensitivity and specificity. Results: All candidate variables contributed significantly to the MICE model. However, grade of disease lost its effect in the other three models. The MICE model showed the best performance, followed by the E-M model. Conclusion: The final models differed across imputation methods in both composition and performance. Therefore, modern imputation methods are recommended to recover the information

Simorgh Research Repository
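
The two simplest approaches named in the abstract above, median replacement and regression imputation, can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the toy data and the use of a single, fully observed predictor are all assumptions (the study regressed on all remaining variables).

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def impute_regression(target, predictor):
    """Fill None entries of `target` from a least-squares line on `predictor`.

    Assumption: one fully observed predictor, fitted on complete pairs only.
    """
    pairs = [(x, y) for x, y in zip(predictor, target) if y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x if y is None else y
            for x, y in zip(predictor, target)]


# Hypothetical toy data: tumour size (cm) with two missing entries, and age.
size = [2.0, None, 3.0, 4.0, None]
age = [40, 50, 50, 60, 70]
print(impute_median(size))
print(impute_regression(size, age))
```

Median replacement gives every missing case the same value, whereas regression imputation lets the filled value vary with the predictor; neither adds the random variation that multiple-imputation methods such as MICE introduce.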

Impact of Imputation of Missing Data on Estimation of Survival Rates: An Example in Breast Cancer

Author: Baneshi MR
Talei AR

Background: Multifactorial regression models are frequently used in medicine to estimate the survival rate of patients across risk groups. However, their results are not generalisable if the assumptions required are not satisfied when the models are developed. Missing data is a common problem in pathology. The aim of this paper is to address the danger of excluding cases with missing data, and to highlight the importance of imputing missing data before developing multifactorial models. Methods: This study was performed on 310 breast cancer patients diagnosed in Shiraz (Southern Iran). Using a complete-case Cox regression model, a prognostic index was calculated so as to categorise the patients into 3 risk groups. Then, applying the Multivariate Imputation via Chained Equations (MICE) method, missing data were imputed 10 times. Using the imputed data sets, modelling was performed to assign patients to risk groups. Estimated actuarial Overall Survival (OS) rates from the analyses of complete-case and imputed data sets were compared. Results: Cases with at least one missing datum experienced a significantly better survival curve. Estimates derived from complete-case data, relative to imputed data sets, underestimated the OS rate in all risk groups. In addition, confidence intervals were wider, indicating loss in precision due to attrition in sample size and power. Conclusion: The results highlighted the danger of excluding cases with missing data. Imputation of missing data avoids biased estimates, increases the precision of estimates, and improves generalisability of results to other similar populations

Simorgh Research Repository
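
The abstract above reports that missing data were imputed 10 times but does not say how the per-imputation results were combined. A common choice is Rubin's rules, sketched below purely as an assumption about how such pooling could be done; the function name and the toy numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and their squared standard errors.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # spread across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical survival-rate estimates from three imputed data sets.
estimate, total_var = pool_rubin([0.80, 0.82, 0.84], [0.0004, 0.0004, 0.0004])
print(estimate, total_var ** 0.5)
```

The between-imputation term is what carries the extra uncertainty due to the missing values into the final standard error.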

Survival Models in Breast Cancer Patients

Author: Baneshi MR
Mehrabani D
Rajaeefard AR
Talei AR

Background: Breast cancer is the most prevalent malignancy among Iranian women. Five- and ten-year survival is one of the indicators used to evaluate the quality of care after surgery. In this study, we used several survival models to determine risk factors, survival times and life expectancies for different types of surgery. Methods: This study was performed on 310 patients who underwent surgery during a ten-year period. Logistic regression and Cox regression models were used to analyze the factors leading to death. The Kaplan-Meier method (non-parametric) was used to estimate the survival rate. The log-rank test was used to compare survival in different groups. To compare life expectancy for different types of surgery, we used the actuarial life table method. Results: Logistic regression showed that stage, grade, age and history of benign tumor had a significant relationship with death. The log-rank test showed a significant difference in survival between patients with different stages, ages and histories of benign tumors. The Cox regression model demonstrated that stage, grade, age and benign problems were the major risk factors. The actuarial life table model showed that the life expectancy for all patients was 10.03 years. In early-stage breast cancer, life expectancy after mastectomy and lumpectomy was 8.99 and 8.35 years, respectively, a difference that was not significant. Conclusion: It can be concluded that higher stage, grade and age, and a history of benign tumor, were the most important risk factors associated with mortality in breast cancer patients. This study showed that there was no significant difference between life expectancies after mastectomy and lumpectomy surgery

Simorgh Research Repository
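
The Kaplan-Meier estimate mentioned in the abstract above multiplies, at each observed death time, the fraction of at-risk patients who survive that time. The following is a minimal pure-Python sketch of that product-limit calculation; the function name and toy follow-up data are illustrative assumptions, not the study's data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group everyone who leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in years: two deaths at 5, one censored at 8, one death at 10.
print(kaplan_meier([5, 5, 8, 10], [1, 1, 0, 1]))
```

Censored patients reduce the number at risk for later times without causing a step in the curve.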

Tamoxifen resistance in early breast cancer: statistical modelling of tissue markers to improve risk prediction

Author: Anderson N
Baneshi MR
Bartlett JMS
Cooke TG
Edwards J
Warner P

BACKGROUND: For over two decades, the Nottingham Prognostic Index (NPI) has been used in the United Kingdom to calculate risk scores and inform management about breast cancer patients. It is derived using just three clinical variables – nodal involvement, tumour size and grade. New scientific methods now make cost-effective measurement of many biological characteristics of tumour tissue from breast cancer biopsy samples possible. However, the number of potential explanatory variables to be considered presents a statistical challenge. The aim of this study was to investigate whether in ER+ tamoxifen-treated breast cancer patients, biological variables can add value to NPI predictors, to provide improved prognostic stratification in terms of overall recurrence-free survival (RFS) and also in terms of remaining recurrence free while on tamoxifen treatment (RFoT). A particular goal was to enable the discrimination of patients with a very low risk of recurrence. METHODS: Tissue samples of 401 cases were analysed by microarray technology, providing biomarker data for 72 variables in total, from AKT, BAD, HER, MTOR, PgR, MAPK and RAS families. Only biomarkers screened as potentially informative (i.e., exhibiting univariate association with recurrence) were offered to the multivariate model. The multiple imputation method was used to deal with missing values, and bootstrap sampling was used to assess internal validity and refine the model. RESULTS: Neither the RFS nor RFoT models derived included Grade, but both had better predictive and discrimination ability than NPI. A slight difference was observed between models in terms of biomarkers included, and, in particular, the RFoT model alone included HER2. The estimated 7-year RFS rates in the lowest-risk groups by RFS and RFoT models were 95 and 97%, respectively, whereas the corresponding rate for the lowest-risk group of NPI was 89%.
CONCLUSION: The findings demonstrate considerable potential for improved prognostic modelling by incorporation of biological variables into risk prediction. In particular, the ability to identify a low-risk group with minimal risk of recurrence is likely to have clinical appeal. With larger data sets and longer follow-up, this modelling approach has the potential to enhance an understanding of the interplay of biological characteristics, treatment and cancer recurrence. British Journal of Cancer (2010) 102

Simorgh Research Repository

Parental physical activity, safety perceptions and children’s independent mobility

Author: A Carver
A Fyhri
A Timperio
AC Gielen
AE Bauman
AJ Romero
AN Pizarro
Andreia N Pizarro
AS Page
BE Saelens
BF Fuemmeler
CL Craig
D Crawford
Elisa A Marques
FR Alparone
G Hawthorne
G Valentine
I Janssen
IPAQ
J Veitch
JA Reed
JF Sallis
JJ Prochaska
Jorge Mota
K Malone
K Van Der Horst
L Grize
L Karsten
L Uijtdewilligen
LA Weir
LE McCurdy
M Hillman
M Kytta
M Kyttä
M Linting
M Prezza
MA Davenport
Maria Paula Santos
MR Baneshi
R Jago
R Mackett
S Wilcox
SL Gustafson
V Carson
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Assessment of Internal Validity of Prognostic Models through Bootstrapping and Multiple Imputation of Missing Data

Author: A Talei
MR Baneshi
Publication venue: Tehran University of Medical Sciences
Publication date: 01/04/2012

Background: Prognostic models have clinical appeal as an aid to therapeutic decision making. Two main practical challenges in developing such models are assessing their validity and imputing missing data. This study highlights the importance of imputing missing data and of applying the bootstrap technique in the development, simplification, and assessment of internal validity of a prognostic model. Methods: Overall, 310 breast cancer patients were recruited. Missing data were imputed 10 times. Then, to assess the sensitivity of the model to small changes in the data (internal validity), 100 bootstrap samples were drawn from each of the 10 imputed data sets, leading to 1000 samples. A Cox regression model was fitted to each of the 1000 samples. Only variables retained in more than 50% of samples were used in developing the final model. Results: Four variables remained significant in more than 50% (i.e. 500 samples) of bootstrap samples: tumour size (91%), tumour grade (64%), history of benign breast disease (77%), and age at diagnosis (59%). Tumour size was the strongest predictor, with inclusion frequency exceeding 90%. Number of deliveries was correlated with age at diagnosis (r=0.35, P<0.001). These two variables together remained significant in more than 90% of samples. Conclusion: We addressed two important methodological issues using a cohort of breast cancer patients. The algorithm combines multiple imputation of missing data and bootstrapping, and has the potential to be applied in all kinds of regression modelling exercises so as to address the internal validity of models.

Directory of Open Access Journals
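
The procedure in the abstract above (100 bootstrap samples from each of 10 imputed data sets, keeping variables retained in more than 50% of the 1000 fits) can be outlined as below. This is a sketch under stated assumptions: `select_variables` is a hypothetical callback standing in for the Cox model fit and variable selection, which are not implemented here, and all function names are illustrative.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def inclusion_frequencies(imputed_sets, select_variables, n_boot=100, seed=0):
    """Fraction of bootstrap samples in which each variable is retained.

    select_variables is a hypothetical callback: it takes a list of rows and
    returns the names of the variables kept by the fitted model.
    """
    rng = random.Random(seed)
    counts = Counter()
    total = 0
    for rows in imputed_sets:
        for _ in range(n_boot):
            counts.update(set(select_variables(bootstrap_sample(rows, rng))))
            total += 1
    return {name: n / total for name, n in counts.items()}


def retained(frequencies, threshold=0.5):
    """Variables kept in more than `threshold` of the samples."""
    return sorted(name for name, f in frequencies.items() if f > threshold)


# Inclusion frequencies reported in the abstract.
reported = {"tumour size": 0.91, "tumour grade": 0.64,
            "benign breast disease": 0.77, "age at diagnosis": 0.59}
print(retained(reported))
```

With 10 imputed sets and `n_boot=100`, the denominator is the 1000 samples described in the abstract.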

Prevention of Disease Complications Through Diagnostic Models: How to Tackle the Problem of Missing Data?

Author: H Faramarzi
M Marzban
MR Baneshi
Publication venue: Tehran University of Medical Sciences
Publication date: 01/01/2012

Background: Diagnostic models are frequently used to assess the role of risk factors in disease complications, and thereby to help avoid them. Missing data are a challenge in model building. The aim of this study was to develop a diagnostic model to predict death in HIV/AIDS patients when missing data exist. Methods: HIV patients (n=1460) referred to the Voluntary Counseling and Testing Center (VCT) of Shiraz, southern Iran, during 2004-2009 were recruited. Univariate association between variables and death was assessed. Only variables with univariate P<0.25 were offered to the multifactorial models. First, patients with missing data on candidate variables were deleted (C-C model). Then, applying Multivariable Imputation via Chained Equations (MICE), missing data were imputed. Logistic regression was fitted to the C-C and imputed data sets (MICE model). Models were compared in terms of the number of variables retained in the final model, width of confidence intervals, and discrimination ability. Results: About 22% of data were lost in the C-C model. The number of variables retained in the C-C and MICE models was 2 and 6, respectively. Confidence intervals (C.I.) for the C-C model were wider than those for MICE. The MICE model showed greater discrimination ability than the C-C model (70% versus 64%). Conclusion: The C-C analysis resulted in loss of power and wide C.I.'s. Once missing data were imputed, more variables reached the significance level and C.I.'s were narrower. Therefore, we recommend the application of the imputation method for handling missing data

Directory of Open Access Journals
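
The abstract above compares discrimination ability (70% versus 64%) without naming the measure. For a logistic model this is often the area under the ROC curve, so the sketch below assumes that reading; the function name and toy values are hypothetical.

```python
def auc(scores, labels):
    """Probability that a random case scores higher than a random non-case.

    scores -- predicted risk per patient
    labels -- 1 if the outcome occurred, 0 otherwise
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and observed outcomes.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from non-cases.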

On the use of fractional polynomial models to assess preventive aspect of variables: An example in prevention of mortality following HIV infection

Author: Baneshi MR
Law M
Nakhaee F
Publication date: 01/04/2013

Background: Identification of disease risk factors can help in the prevention of diseases. In assessing the predictive value of continuous variables, a routine procedure is to categorize the factors. This makes it impossible to detect nonlinear relationships, if they exist. Multivariate fractional polynomial (MFP) modeling is a flexible method to reveal nonlinear associations. We aim to demonstrate the impact of the choice of risk function on the significance of variables. Methods: We selected 6508 HIV-infected persons registered in the Australia National HIV Registry between 1980 and 2003 to assess the predictors associated with the risk of death after HIV infection prior to AIDS. First, CD4 count as a categorical factor, with three other categorical variables (age, sex, and HIV exposure category), was entered into the Cox regression model. Second, CD4 count as a continuous variable, along with the other categorical variables, was entered into the fractional polynomial (FP) model. Results: Both the Cox and FP models showed that age ≥ 40 years and hemophilia were significantly associated with increased risk of death. In the categorized model, the CD4 variable did not reach the significance level. However, this variable was highly significant in the MFP model. The FP model showed slightly better performance in terms of discrimination ability and goodness of fit. Conclusions: The FP model is a flexible method for detecting the predictive effect of continuous variables. This method enhances the ability to assess the predictive ability of variables and improves model performance

UNSWorks
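
Fractional polynomial modelling, as used in the abstract above, replaces a continuous variable such as CD4 count with power transforms drawn from a small conventional set, where power 0 stands for the natural logarithm. The sketch below only generates the candidate first-degree transforms; model fitting and selection among them are not shown, and the function names and example values are illustrative assumptions.

```python
import math

FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)  # conventional FP candidate set


def fp_term(x, power):
    """First-degree FP transform of a positive value; power 0 means log."""
    if x <= 0:
        raise ValueError("fractional polynomials need x > 0")
    return math.log(x) if power == 0 else x ** power


def fp1_candidates(values):
    """Map each candidate power to the transformed column."""
    return {p: [fp_term(v, p) for v in values] for p in FP_POWERS}


# Hypothetical CD4 counts (cells per microlitre).
candidates = fp1_candidates([200, 350, 500])
print(sorted(candidates))
```

In practice each candidate column would be entered into the regression in turn and the best-fitting power kept; values at or below zero need shifting before transformation, which this sketch rejects rather than handles.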

Comparison of conventional risk factors in middle-aged versus elderly diabetic and nondiabetic patients with myocardial infarction: prediction with decision-analytic model

Author: Baneshi Mohammad Reza
Mahmoodi MR
Rastegari Azam
Publication date: 01/01/2015

BACKGROUND: We sought to predict occurrence of myocardial infarction (MI) from conventional risk factors by means of a classification and regression tree (CART) model in middle-aged versus elderly (age ≥65 years) diabetic and nondiabetic patients from the Modares Heart Study. METHOD: A total of 469 patients were randomly selected and categorized into two groups according to clinical diabetes status. Group I consisted of 238 diabetic patients and group II consisted of 231 nondiabetic patients. Our population was MI positive. The outcome investigated was diabetes mellitus. We used a decision-analytic model to predict the diagnosis of patients with suspected MI. RESULTS: We constructed 4 predictive patterns using 12 input variables and 1 output variable and compared them in terms of sensitivity, specificity and risk. The differences among patterns were due to the predictor variables included. The CART model selected different variables (hypertension, mean cell volume, fasting blood sugar, cholesterol, triglyceride and uric acid concentration) for middle-aged and elderly patients at high risk for MI. Levels of biochemical measurements were identified as the best risk cutoff points. In evaluating the precision of the different patterns, sensitivity and specificity were 47.9-84.0% and 56.3-93.0%, respectively. CONCLUSIONS: The CART model is capable of representing clinical data in an interpretable form to confirm and better predict MI occurrence in the clinic or in hospital. The predictor variables in each pattern could therefore affect the outcome differently depending on age group. Hyperglycemia, hypertension, hyperlipidemia and hyperuricemia were serious predictors for occurrence of MI in diabetics

Directory of Open Access Journals

PubMed Central

Simorgh Research Repository
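
Sensitivity and specificity, the measures reported for the CART patterns in the abstract above, follow directly from the counts of correct and incorrect classifications. The sketch below is a generic illustration with hypothetical labels, not a reconstruction of the study's analysis.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for binary 0/1 labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    sensitivity = tp / (tp + fn)  # share of true cases that were flagged
    specificity = tn / (tn + fp)  # share of non-cases correctly cleared
    return sensitivity, specificity


# Hypothetical tree predictions against observed outcomes.
print(sensitivity_specificity([1, 1, 0, 0, 1], [1, 0, 0, 0, 1]))
```

A tree with different splits or cutoff points will generally trade one measure against the other, which is consistent with the ranges reported across the four patterns.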

Can we Replace Arterial Blood Gas Analysis by Pulse Oximetry in Neonates with Respiratory Distress Syndrome, who are Treated According to INSURE Protocol?

Author: Bahman Bijari B
Baneshi MR
Niknafs P
Norouzi E
Publication date: 01/01/2015

Neonates with respiratory distress syndrome (RDS) who are treated according to the INSURE protocol require arterial blood gas (ABG) analysis to decide on appropriate management. We conducted this study to investigate the validity of pulse oximetry as a replacement for frequent ABG analysis in the evaluation of these patients. From a total of 193 blood samples obtained from 30 neonates weighing <1500 grams with RDS, 7.2% were found to have one or more of the following: acidosis, hypercapnia, or hypoxemia. We found that pulse oximetry had good validity in the detection of hyperoxemia, sufficient to manage patients appropriately without blood gas analysis. However, the validity of pulse oximetry was not good enough to detect acidosis, hypercapnia, and hypoxemia

Simorgh Research Repository