9 research outputs found

Impact of Imputation of Missing Data on Estimation of Survival Rates: An Example in Breast Cancer

Authors: Baneshi MR, Talei AR

Background: Multifactorial regression models are frequently used in medicine to estimate the survival rates of patients across risk groups. However, their results are not generalisable if the assumptions required during model development are not satisfied. Missing data is a common problem in pathology. The aim of this paper is to address the danger of excluding cases with missing data, and to highlight the importance of imputing missing data before developing multifactorial models. Methods: This study was performed on 310 breast cancer patients diagnosed in Shiraz (Southern Iran). Using a complete-case Cox regression model, a prognostic index was calculated to categorise the patients into 3 risk groups. Then, applying the Multivariate Imputation via Chained Equations (MICE) method, missing data were imputed 10 times. Using the imputed data sets, modelling was performed to assign patients to risk groups. The estimated actuarial Overall Survival (OS) rates from the complete-case analysis and from the imputed data sets were compared. Results: Cases with at least one missing datum experienced a significantly better survival curve. Estimates derived from the complete-case data, relative to the imputed data sets, underestimated the OS rate in all risk groups. In addition, the confidence intervals were wider, indicating a loss of precision due to the reduction in sample size and power. Conclusion: The results highlight the danger of excluding cases with missing data. Imputation of missing data avoids biased estimates, increases the precision of estimates, and improves the generalisability of results to other similar populations.

Simorgh Research Repository
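
The prognostic-index step in the abstract above can be illustrated with a minimal Python sketch. It assumes the index is the usual Cox linear predictor (the sum of coefficient times covariate) and that the 3 risk groups are formed at tertile cut points; the abstract does not state the cut points, and the coefficient values and the function names `prognostic_index` and `risk_groups` are hypothetical.

```python
from statistics import quantiles


def prognostic_index(covariates, coefficients):
    """Linear predictor of a Cox model: sum of coefficient * covariate value."""
    return sum(coefficients[name] * value for name, value in covariates.items())


def risk_groups(indices):
    """Assign each index to a low/intermediate/high group at tertile cut points.

    Tertiles are an assumption made for illustration only.
    """
    low_cut, high_cut = quantiles(indices, n=3)
    groups = []
    for value in indices:
        if value <= low_cut:
            groups.append("low")
        elif value <= high_cut:
            groups.append("intermediate")
        else:
            groups.append("high")
    return groups


# Hypothetical coefficients and patients, not taken from the study.
coefficients = {"stage": 0.8, "grade": 0.5}
patients = [
    {"stage": 1, "grade": 1},
    {"stage": 1, "grade": 2},
    {"stage": 2, "grade": 1},
    {"stage": 2, "grade": 3},
    {"stage": 3, "grade": 2},
    {"stage": 3, "grade": 3},
]
indices = [prognostic_index(p, coefficients) for p in patients]
groups = risk_groups(indices)
```

In the study this grouping was done once on the complete cases and again on the imputed data sets; the sketch covers only the grouping step itself.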

Does the Missing Data Imputation Method Affect the Composition and Performance of Prognostic Models?

Authors: Baneshi MR, Talei AR

Background: We previously showed that imputing missing data (via the Multivariate Imputation via Chained Equations (MICE) method) is superior to excluding cases with missing data; however, the methodology of MICE is complicated, and easier imputation methods are available. The aim of this study was to compare these methods in terms of model composition and performance. Methods: Three hundred and ten breast cancer patients were recruited. Four approaches were applied to impute missing data. First, we adopted an ad hoc method in which missing data for each variable were replaced by the median of the observed values. Then 3 likelihood-based approaches were used. In regression imputation, a regression model related the variable with missing data to the rest of the variables, and the regression equation was used to fill in the missing data. The Expectation-Maximization (E-M) algorithm was implemented, in which missing data and regression parameters were estimated iteratively until the regression parameters converged. Finally, the MICE method was applied. The models developed were compared in terms of the variables that contributed significantly to the multifactorial analysis, and in terms of sensitivity and specificity. Results: All candidate variables contributed significantly to the MICE model. However, grade of disease lost its effect in the other three models. The MICE model showed the best performance, followed by the E-M model. Conclusion: The final models differed across imputation methods in terms of composition and performance. Therefore, modern imputation methods are recommended to recover the information.

Simorgh Research Repository
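
The two simplest approaches named in the abstract above, median imputation and regression imputation, can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the authors' implementation: missing values are encoded as `None`, the regression uses a single fully observed predictor fitted by ordinary least squares (the study regressed on the rest of the variables), and the function names are hypothetical.

```python
from statistics import median


def median_impute(values):
    """Replace each missing value (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def regression_impute(target, predictor):
    """Fill missing target values from a least-squares line on one predictor.

    The line is fitted on rows where the target is observed; the predictor is
    assumed to be fully observed.
    """
    pairs = [(x, y) for x, y in zip(predictor, target) if y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x if y is None else y
            for x, y in zip(predictor, target)]


# Hypothetical toy data, not taken from the study.
filled_by_median = median_impute([1, None, 3, 5])
filled_by_regression = regression_impute([2.0, 4.0, None, 8.0], [1, 2, 3, 4])
```

Both functions produce a single filled-in value per missing entry, so neither reflects the uncertainty of the imputation; MICE addresses this by generating several imputed data sets.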

Survival Models in Breast Cancer Patients

Authors: Baneshi MR, Mehrabani D, Rajaeefard AR, Talei AR

Background: Breast cancer is the most prevalent malignancy among Iranian women. Five- and ten-year survival is one of the indicators used to evaluate the quality of care after surgery. In this study, we used several survival models to determine risk factors, survival times and the life expectancies associated with different types of surgery. Methods: This study was performed on 310 patients who underwent surgery during a ten-year period. Logistic regression and Cox regression models were used to analyze the factors leading to death. The Kaplan-Meier method (non-parametric) was used to estimate the survival rate. The log-rank test was used to compare survival between groups. To compare the life expectancy associated with different types of surgery, we used the actuarial life table method. Results: Logistic regression showed that stage, grade, age and history of benign tumor had a significant relationship with death. The log-rank test showed a significant difference in survival between patients with different stages, ages and histories of benign tumors. The Cox regression model demonstrated that stage, grade, age and benign problems were the major risk factors. The actuarial life table model showed that the life expectancy for all patients was 10.03 years. In early-stage breast cancer, life expectancy after mastectomy and lumpectomy was 8.99 and 8.35 years, respectively, a difference that was not significant. Conclusion: It can be concluded that higher stage, grade and age, and a history of benign tumor, were the most important risk factors correlated with mortality in breast cancer patients. This study showed no significant difference between the life expectancies associated with mastectomy and lumpectomy.

Simorgh Research Repository
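
The Kaplan-Meier method mentioned in the abstract above can be illustrated with a short Python sketch of the product-limit estimator. The function name `kaplan_meier`, the event coding (1 for death, 0 for censored) and the follow-up times are assumptions made for the example and are not data from the study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at this time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving this time.
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times in years; the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients lower the number at risk without lowering the survival estimate, which is what distinguishes this estimator from a simple proportion of survivors.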