125 research outputs found
Statistical models in prognostic modelling with many skewed variables and missing data: a case study in breast cancer
Prognostic models have clinical appeal to aid therapeutic decision making. In the
UK, the Nottingham Prognostic Index (NPI) has been used, for over two decades, to
inform patient management. However, it has been commented that NPI is not
capable of identifying a subgroup of patients with a prognosis so good that adjuvant
therapy with potential harmful side effects can be withheld safely.
Tissue Microarray Analysis (TMA) now makes it possible to measure biological
features of frozen biopsies from breast cancer tumours. These give
insight into the biology of the tumour and hence have the potential to enhance
prognostic modelling. I therefore wished to investigate whether biomarkers can add
value to clinical predictors to provide improved prognostic stratification in terms of
Recurrence Free Survival (RFS).
However, very many biomarkers could be measured, they usually
exhibit skewed distributions, and missing values are common. The statistical issues
raised are thus the number of variables being tested, the form of the association, imputation
of missing data, and assessment of the stability and internal validity of the model.
Therefore the specific aim of this study was to develop and to demonstrate
performance of statistical modelling techniques that will be useful in circumstances
where there is a surfeit of explanatory variables and missing data; in particular to
achieve useful and parsimonious models while guarding against instability and
overfitting. I also sought to identify a subgroup of patients with a prognosis so good
that a decision can be made to avoid adjuvant therapy. I aimed to provide statistically
robust answers to a set of clinical questions and to develop strategies for such
data sets that would be useful and acceptable to clinicians.
A unique data set of 401 Estrogen Receptor positive (ER+), tamoxifen-treated breast
cancer patients with measurements for a large panel of biomarkers (72 in total) was
available. Taking a statistical approach, I applied a multi-faceted screening process to
select a limited set of potentially informative variables and to detect the appropriate
form of the association, followed by multiple imputation of missing data and
bootstrapping. In comparison with the NPI, the final joint model assigned
patients to more appropriate risk groups (14% of recurred and 4% of non-recurred
cases). The actuarial 7-year RFS rate for patients in the lowest risk quartile was 95%
(95% C.I.: 89%, 100%).
To evaluate an alternative approach, biological knowledge was incorporated into the
process of model development. Model building began with the use of biological
expertise to divide the variables into substantive biomarker sets on the basis of
presumed role in the pathway to cancer progression. For each biomarker family, an
informative and parsimonious index was generated by combining family variables, to
be offered to the final model as an intermediate predictor. In comparison with the NPI,
this model assigned patients to more appropriate risk groups (21% of recurred and 11% of non-recurred
patients). This model identified a low-risk group with a 7-year RFS rate of 98% (95%
C.I.: 96%, 100%).
Clinical Environment Assessment Based on DREEM Model from the Viewpoint of Interns and Residents of Hospitals Affiliated with Kerman University of Medical Sciences, Iran
Background & Objective: Clinical environments play a crucial role in medical students' training. Thus, the aim of this study was to assess clinical environments based on the Dundee Ready Education Environment Measure (DREEM) model from the viewpoint of interns and residents in hospitals affiliated with Kerman University of Medical Sciences, Iran, in 2012. Methods: This was a descriptive-analytic study. The data collection tool was the DREEM Questionnaire with 50 questions (5-point Likert scale) in the 5 domains of learning, teachers, educational environment, student's academic self-perceptions, and student's social self-perceptions. The study environment consisted of 4 main wards (internal, surgical, pediatrics, and gynecology) of hospitals affiliated with Kerman University of Medical Sciences. The study subjects consisted of 63 interns and 73 residents. Data were analyzed in SPSS software using Student's t-test and ANOVA. Results: The mean score of perception of the educational environment was 161.17 ± 22.30 in interns and 157.45 ± 21.14 in residents. Comparison of the different domains of clinical environment evaluation showed a significant difference between the two groups only in the domain of student's social self-perceptions (P < 0.05), where the interns' score was higher than that of the residents. No significant differences were observed between hospitals or between the studied wards. Conclusion: The students' perceptions of their educational environment in clinical wards were desirable. Although the literature recommends using DREEM to evaluate the weaknesses and strengths of clinical environments, the concurrent use of other methods and instruments to assess the efficacy of this questionnaire is recommended. Key Words: DREEM model, Assessment, Residents, Interns, Ira
Estimating the Visibility Rate of Alcohol Consumption: A Case Study in Shiraz, Iran
Background: Network Scale Up (NSU) is applied in many settings to estimate the size of hidden populations. The visibility of alcohol consumption, as a hidden behavior, in Iran has not yet been established. Our aim is to estimate the visibility factor (VF) of alcohol consumption in Iran, which is an Islamic country in the Middle East. Methods: Ninety persons who had a history of alcohol consumption were recruited. Relationships in the network were grouped into three main subgroups: immediate family, extended family, and non-family. According to the game of contact methodology, participants answered questions about the total number of persons they know in each relationship category and the number of those persons who are aware of their alcohol consumption. VF was calculated by dividing the total number of people aware of the respondent’s alcohol consumption by the total size of the respondent’s social network. The 95% confidence intervals (CIs) were computed through bootstrapping. Findings: The mean and standard deviation (SD) of participants’ age was 32.9 ± 10.2, and the sex ratio was 3. Overall VF (95% CI) was 40% (33% to 47%). VF was estimated at 44% and 23% among men’s and women’s networks, respectively. The immediate family was the most informed group, followed by non-family and extended family members. Conclusion: The visibility of alcohol consumption in Iran was not high. This is due to religious and legal prohibitions around i
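The VF calculation and bootstrap interval described in this abstract can be illustrated with a minimal sketch. It assumes a pooled ratio (all aware contacts divided by all contacts) and a percentile bootstrap that resamples respondents; the abstract does not state which variants were used, and the function names and data below are hypothetical.

```python
import random


def visibility_factor(aware_counts, network_sizes):
    """Pooled VF: contacts aware of the behaviour divided by all contacts."""
    return sum(aware_counts) / sum(network_sizes)


def bootstrap_ci(aware_counts, network_sizes, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling respondents with replacement."""
    rng = random.Random(seed)
    n = len(aware_counts)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            visibility_factor(
                [aware_counts[i] for i in idx],
                [network_sizes[i] for i in idx],
            )
        )
    estimates.sort()
    lower = estimates[int(alpha / 2 * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical respondents: contacts aware of the behaviour, and network size.
aware = [4, 10, 2, 7, 5]
total = [12, 20, 10, 15, 13]

vf = visibility_factor(aware, total)
ci_low, ci_high = bootstrap_ci(aware, total)
print(f"VF = {vf:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f})")
```

Resampling whole respondents, rather than individual contacts, keeps each person's aware and total counts paired, which is the usual choice when respondents are the sampling unit.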
A Guide to Selecting the Appropriate Statistical Tests for Proposals and Articles in Medical Sciences
Background & Objective: The main purpose of medical research is to answer a research question or to solve a problem in order to promote the health of a society. The first objective is to answer the research question correctly with minimal error. The second objective is to publish the results so that they can be generalized to a population and applied more widely. Biostatistics is necessary to achieve these objectives. Despite the importance of biostatistics in medical research, researchers have a limited understanding of it or refrain from using it because of its complexity. Statistics helps the researcher at different stages of research, including writing a proposal and interpreting other papers. Moreover, biostatisticians and epidemiologists play a very important role in the preparation of manuscripts for publication. The present article describes the most important statistical tests in medical research with applied examples.
Keywords
Selecting statistical tests, Parametric tests, Non-parametric test
Estimation of the Active Network Size of Kermanian Males
Background: Estimation of the size of hidden and hard-to-reach sub-populations, such as drug-abusers, is a very important but difficult task. Network scale up (NSU) is one of the indirect size estimation techniques, which relies on the frequency of people belonging to a sub-population of interest among the social network of a random sample of the general population. In this study, we estimated the social network size of Kermanian males (C) as one of the main prerequisites for using NSU. Methods: A random sample of 500 Kermanian males between 18 and 45 years old was interviewed. We asked about the size of their active networks using direct questions. In addition, we obtained the frequency of six names among Kermanian males from the vital registry office, and we estimated C indirectly using these frequencies and the frequency of the same names among the networks of our sample. Findings: Although different methods gave quite different values of C, between 100 and 350, the best estimate of C was 303, which means that on average each Kermanian male knows around 303 males aged 18 to 45 years. The estimated C did not have any strong association with the demographic variables of our subjects. Conclusion: Using the estimated C, we may use the NSU technique to assess the frequency of many important hidden sub-populations such as drug-abusers and those who have sexual contact with men and women. Keywords: Size estimation, Social network, Networking, Addiction, Hidden population, Hard to reach population
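The indirect estimation of C from name frequencies can be sketched with the common known-population ratio estimator, in which a respondent's network size is the population size scaled by the share of the known sub-populations (here, names) that the respondent knows. This is an assumption for illustration: the abstract does not say which estimator was used, and all numbers and names below are hypothetical.

```python
def network_size(known_counts, subpop_sizes, population_size):
    """Ratio estimator: C_i = T * sum_k(m_ik) / sum_k(e_k).

    known_counts    -- m_ik, people respondent i knows with each name
    subpop_sizes    -- e_k, registry frequency of each name in the population
    population_size -- T, size of the reference population
    """
    return population_size * sum(known_counts) / sum(subpop_sizes)


def mean_network_size(all_known_counts, subpop_sizes, population_size):
    """Average the per-respondent estimates to obtain an overall C."""
    sizes = [
        network_size(counts, subpop_sizes, population_size)
        for counts in all_known_counts
    ]
    return sum(sizes) / len(sizes)


# Hypothetical example: three names, two respondents.
name_frequencies = [2000, 1500, 500]   # registry counts for each name
reference_population = 100_000         # males aged 18 to 45 in the area
respondents = [[6, 5, 1], [3, 2, 1]]   # acquaintances with each name

c_first = network_size(respondents[0], name_frequencies, reference_population)
c_mean = mean_network_size(respondents, name_frequencies, reference_population)
print(c_first, c_mean)
```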
Application of Random Forest Survival Models to Increase Generalizability of Decision Trees: A Case Study in Acute Myocardial Infarction
Background. Tree models provide an easily interpretable prognostic tool but unstable results. Two approaches to enhance the generalizability of the results are pruning and random survival forest (RSF). The aim of this study is to assess the generalizability of saturated tree (ST), pruned tree (PT), and RSF. Methods. Data from 607 patients were randomly divided into training and test sets using 10-fold cross-validation. All three models were fitted to the training sets. ST was constructed by searching for optimal cutoffs using the Log-Rank test. PT was selected by plotting error rate versus minimum sample size in terminal nodes. In the construction of RSF, 1000 bootstrap samples were drawn from the training set. The C-index and integrated Brier score (IBS) were used to compare models. Results. ST provided the most overoptimistic statistics: the mean difference between the C-index in the training and test sets was 0.237. The corresponding figures for PT and RSF were 0.054 and 0.007. In terms of IBS, the difference was 0.136 in ST, 0.021 in PT, and 0.0003 in RSF. Conclusion. Pruning a tree and assessing its performance on a test set partially improve the generalizability of decision trees. RSF provides results that are highly generalizable.
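The training-versus-test comparison of the C-index reported above can be illustrated with a minimal sketch of Harrell's concordance index for right-censored data. It is a plain O(n²) implementation written for clarity, not the code used in the study; the function names and data are hypothetical, and higher risk scores are assumed to mean shorter expected survival.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had the event (events[i] == 1)
    and did so strictly before subject j's observed time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable


def optimism(c_train, c_test):
    """Gap between apparent (training) and test-set discrimination."""
    return c_train - c_test


# Hypothetical data: follow-up times, event indicators (1 = event), risk scores.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [4, 3, 2, 1])
reversed_ranking = concordance_index(times, events, [1, 2, 3, 4])
print(perfect, reversed_ranking, optimism(perfect, reversed_ranking))
```

A large positive gap between training and test C-index, like the one reported for the saturated tree, indicates that the apparent discrimination does not carry over to new patients.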
Pattern of Alcohol Consumption among Men Consumers in Kerman, Iran, in 2014
Background: Alcohol consumption is a potential risk factor with acute and chronic health consequences and social impacts, which is more prominent among men. There are no precise statistics on the scope of alcohol consumption in Iran; however, there is some evidence of an increasing trend, particularly among the young generation. In order to evaluate the scope of this issue in Kerman, a large city in the south-east of Iran, this exploratory study was designed to approach a group of people with experience of alcohol use. Methods: Participants were recruited using snowball sampling. 200 eligible subjects were questioned about the type of alcohol consumed, frequency of use, and other factors associated with alcohol consumption. In order to maximize the validity of responses, data were collected through self-administered questionnaires. Findings: The main alcoholic drinks consumed by individuals were homemade distillates (46%), wine (22%), beer (14%), distilled spirits (11%), and medical alcohol (7%), respectively. The majority of individuals participating in the study (73%) used mostly homemade drinks; moreover, 63%, 26%, 9%, and 2% of subjects drank monthly or less, two to four times a month, two to three times a week, and at least four times a week, respectively. Only 2% of the subjects were heavy consumers of alcoholic beverages. Conclusion: Given the lack of control over homemade alcoholic beverages, their high level of use can pose a substantial potential risk. Furthermore, both access and price appear to strongly influence the amount of alcohol consumed by individuals. Therefore, further studies in this area will help to reduce the harm caused by alcohol consumption
Impact of verbal explanation on parental acceptance level of different behavior management techniques in dental office
BACKGROUND AND AIM: Parents’ attitudes towards different aspects of dentistry, especially the use of behavior management techniques (BMTs), can greatly affect a child’s cooperation in a dental office. The present quasi-experimental study was conducted to assess the effect of a verbal explanation on parents’ acceptance level of the most common BMTs used in pediatric dentistry. METHODS: A videotaped presentation showing the 6 most commonly used BMTs in Iran was shown to 60 parents recruited by convenience sampling. Using a visual analogue scale (VAS), the acceptance level of each BMT was measured before and after an explanation of the reasons for each BMT. Paired t-test, repeated measures analysis of variance (ANOVA), and independent t-test were used for statistical analysis of data. The significance level was set at 0.050. RESULTS: Giving a verbal explanation of BMTs had a statistically significant effect on the acceptance of BMTs. Tell-show-do (TSD) and hand-over-mouth (HOM) techniques achieved the highest and lowest mean scores of parental acceptance, respectively. Acceptance of physical restraint (P = 0.013) and parental presence/absence (PPA) (P = 0.015) was higher among men than among women (t-test). CONCLUSION: Giving an explanation to parents while performing a BMT is effective in raising parents' acceptance of the technique. Non-invasive methods such as TSD and PPA are more acceptable to parents. KEYWORDS: Pediatric Dentistry; Behavior Control; Parental Consen
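The before-and-after comparison of VAS acceptance scores relies on a paired t-test, whose statistic is the mean of the within-parent differences divided by its standard error. The following is a minimal sketch using only the standard library; the scores are hypothetical, and the p-value lookup (which the study would have obtained from statistical software) is left out.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired measurements.

    t = mean(d) / (sd(d) / sqrt(n)), where d = after - before and sd is the
    sample standard deviation of the differences.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical VAS acceptance scores for one technique, same four parents.
vas_before = [1, 2, 3, 4]
vas_after = [2, 4, 5, 4]

t_value, dof = paired_t_statistic(vas_before, vas_after)
print(f"t = {t_value:.3f} on {dof} degrees of freedom")
```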
Assessment of psychometric properties of Persian version of Perceived Socio-cultural Pressure Scale (PSPS)
Objectives: To generate the Persian version of the Perceived Socio-cultural Pressure Scale.
Methods: The study, done in Kerman, Iran, from November 2010 to February 2011, comprised 1200 volunteers. After
translation and back-translation, the questionnaire’s internal consistency, criterion and construct validity were
evaluated. Individual and global scores of the Perceived Socio-cultural Pressure Scale were assessed between
people with and without eating disorders.
Results: The mean scores for comparison were 14.7±6.64 and 21.84±10.65, giving a p-value of 0.0001. Internal and
inter-item consistency were acceptable. Item-total correlation ranged from 54% to 80%. Construct and criterion
validity of the scale were also acceptable.
Conclusion: The Persian version of the Perceived Socio-cultural Pressure Scale is a competent tool for use in the
general population and in individuals with eating disorders.
Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis
Background: Due to the importance of medical studies, researchers in this field should be familiar with various types of statistical analyses in order to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can complement regression models. We compared the performance of a logistic regression model and a CART in predicting drug injection among prisoners. Methods: Data from 2720 Iranian prisoners were studied to determine the factors influencing drug injection. The collected data were divided into training and testing sets. A logistic regression model and a CART were fitted to the training data. The performance of the two models was then evaluated on the testing data. Findings: The regression model and the CART had 8 and 4 significant variables, respectively. Overall, heroin use, history of imprisonment, age at first drug use, and marital status were important factors in determining the history of drug injection. Subjects without a history of heroin use and heroin users with short-term imprisonment were at lower risk of drug injection. Among heroin addicts with long-term imprisonment, individuals with a higher age at first drug use and married subjects were at lower risk of drug injection. Although the logistic regression model was more sensitive than the CART, the two models had the same levels of specificity and classification accuracy. Conclusion: In this study, both sensitivity and specificity were important. While the logistic regression model had better performance, the graphical presentation of the CART simplifies the interpretation of the results. In general, a combination of different analytical methods is recommended to explore the effects of variables. Keywords: Classification and regression trees, Logistic regression model, History of drug injection, Drug abus
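The three measures used to compare the models on the testing data (sensitivity, specificity, and classification accuracy) can be computed from observed and predicted labels as in the sketch below. The labels are hypothetical and the function name is an illustrative choice; 1 is assumed to denote a history of drug injection.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical test-set labels and predictions from one model.
observed = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(observed, predicted)
print(metrics)
```

Applying the same function to the predictions of each model on the same testing set gives directly comparable figures, which is how two models can differ in sensitivity while matching in specificity and accuracy only if their error counts trade off accordingly.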