5,028 research outputs found

An integrative analysis of cancer gene expression studies using Bayesian latent factor modeling

Author: Chen Julia Ling-Yu
Chi Jen-Tsan
Merl Daniel
West Mike
Publication venue: Institute of Mathematical Statistics
Publication date: 06/10/2010

We present an applied study in cancer genomics for integrating data and inferences from laboratory experiments on cancer cell lines with observational data obtained from human breast cancer studies. The biological focus is on improving understanding of transcriptional responses of tumors to changes in the pH level of the cellular microenvironment. The statistical focus is on connecting experimentally defined biomarkers of such responses to clinical outcome in observational studies of breast cancer patients. Our analysis exemplifies a general strategy for accomplishing this kind of integration across contexts. The statistical methodologies employed here draw heavily on Bayesian sparse factor models for identifying, modularizing and correlating with clinical outcome these signatures of aggregate changes in gene expression. By projecting patterns of biological response linked to specific experimental interventions into observational studies where such responses may be evidenced via variation in gene expression across samples, we are able to define biomarkers of clinically relevant physiological states and outcomes that are rooted in the biology of the original experiment. Through this approach we identify microenvironment-related prognostic factors capable of predicting long-term survival in two independent breast cancer datasets. These results suggest possible directions for future laboratory studies, as well as indicate the potential for therapeutic advances through targeted disruption of specific pathway components. Comment: Published at http://dx.doi.org/10.1214/09-AOAS261 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref

Anatomic Fat Depots and Coronary Plaque Among Human Immunodeficiency Virus-Infected and Uninfected Men in the Multicenter AIDS Cohort Study.

Author: Brown Todd T
Budoff Matthew
Jacobson Lisa P
Kingsley Lawrence
Li Xiuhong
McKibben Rebeccah
Palella Frank J
Post Wendy S
Witt Mallory D
Publication venue: eScholarship, University of California
Publication date: 01/04/2016

Methods. In a cross-sectional substudy of the Multicenter AIDS Cohort Study, noncontrast cardiac computed tomography (CT) scanning for coronary artery calcium (CAC) scoring was performed on all men, and, for men with normal renal function, coronary CT angiography (CTA) was performed. Associations of fat depots (visceral adipose tissue [VAT], abdominal subcutaneous adipose tissue [aSAT], and thigh subcutaneous adipose tissue [tSAT]) with coronary plaque presence and extent were assessed with logistic and linear regression adjusted for age, race, cardiovascular disease (CVD) risk factors, body mass index (BMI), and human immunodeficiency virus (HIV) parameters. Results. Among HIV-infected men (n = 597) but not HIV-uninfected men (n = 343), having greater VAT was positively associated with noncalcified plaque presence (odds ratio [OR] = 1.04, P < .05), with a significant interaction (P < .05) by HIV serostatus. HIV-infected men had lower median aSAT and tSAT and greater median VAT among men with BMI <25 and 25-29.9 kg/m². Among HIV-infected men, VAT was positively associated with presence of coronary plaque on CTA after adjustment for CVD risk factors (OR = 1.04, P < .05), but not after additional adjustment for BMI. There was an inverse association between aSAT and extent of total plaque among HIV-infected men, but not among HIV-uninfected men. Lower tSAT was associated with greater CAC and total plaque score extent regardless of HIV serostatus. Conclusions. The presence of greater amounts of VAT and lower SAT may contribute to increased risk for coronary artery disease among HIV-infected persons.

eScholarship - University of California

Narrowed Gaps and Persistent Challenges: Examining Rural-Nonrural Disparities in Postsecondary Outcomes over Time

Author: Kimball Ezekiel
Kommers Suzan
Manly Catherine A.
Wells Ryan
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2019

Empirical studies have concluded that rural students experience lower rates of college enrollment and degree completion compared to their nonrural peers, but this literature needs to be expanded and updated for a continually changing context. This article examines the rural-nonrural disparities in students’ postsecondary trajectories, influences, and outcomes. By comparing results to past research using similar national data and an identical design, we are able to examine change over time. Results show narrowed gaps from the 1990s into the 2000s, but with rural students still facing persistent challenges and experiencing lower average rates of college enrollment and degree completion.

ScholarWorks@UMass Amherst

Interpretable prognostic modeling of endometrial cancer

Author: Bützow Ralf
Loukovaara Mikko
Pasanen Annukka
Tang Jing
Zagidullin Bulat
Publication date: 01/12/2022

Endometrial carcinoma (EC) is one of the most common gynecological cancers in the world. In this work we apply Cox proportional hazards (CPH) and optimal survival tree (OST) algorithms to the retrospective prognostic modeling of disease-specific survival in 842 EC patients. We demonstrate that linear CPH models are preferred for the EC risk assessment based on clinical features alone, while interpretable, non-linear OST models are favored when patient profiles can be supplemented with additional biomarker data. We show how visually interpretable tree models can help generate and explore novel research hypotheses by studying the OST decision path structure, in which L1 cell adhesion molecule expression and estrogen receptor status are correctly indicated as important risk factors in the p53 abnormal EC subgroup. To aid further clinical adoption of advanced machine learning techniques, we stress the importance of quantifying model discrimination and calibration performance in the development of explainable clinical prediction models. Peer reviewed
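
The abstract above stresses quantifying both discrimination and calibration. As a rough illustration only, and not the authors' method, the sketch below computes two standard summaries for a binary outcome: the AUROC (discrimination, computed by pairwise comparison) and the Brier score (an overall accuracy measure that is sensitive to calibration). Treating the outcome as binary is a simplifying assumption, since the study itself models survival; the function names and toy numbers are hypothetical.

```python
def auroc(y_true, y_prob):
    """Probability that a random positive case is scored above a random
    negative case (ties count as half). Assumes labels are 0 or 1."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome.
    Lower is better; 0 means perfect probabilistic predictions."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data, invented for illustration (not from the study).
outcomes = [1, 0, 1, 0]
predicted = [0.8, 0.3, 0.4, 0.5]
print(auroc(outcomes, predicted))
print(brier_score(outcomes, predicted))
```

A model can rank patients well (high AUROC) while still producing poorly calibrated probabilities, which is why the two quantities are usually reported together.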

Directory of Open Access Journals

PubMed Central

Helsingin yliopiston digitaalinen arkisto

Clinical prediction modelling in oral health: A review of study quality and empirical examples of model development

Author: Du Mi
Publication date: 01/01/2021

Background. Substantial efforts have been made to improve the reproducibility and reliability of scientific findings in health research. These efforts include the development of guidelines for the design, conduct and reporting of preclinical studies (ARRIVE), clinical trials (ROBINS-I, CONSORT), observational studies (STROBE), and systematic reviews and meta-analyses (PRISMA). In recent years, the use of prediction modelling has increased in the health sciences. Clinical prediction models use information at the individual patient level to estimate the probability of health outcome(s). Such models offer the potential to assist in clinical decision-making and to improve medical care. Guidelines such as PROBAST (Prediction model Risk Of Bias Assessment Tool) have been recently published to further inform the conduct of prediction modelling studies. Related guidelines for the reporting of these studies, such as the TRIPOD (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis) instrument, have also been developed. Since the early 2000s, oral health prediction models have been used to predict the risk of various types of oral conditions, including dental caries, periodontal diseases and oral cancers. However, there is a lack of information on the methodological quality and reporting transparency of the published oral health prediction modelling studies. As a consequence, and due to the unknown quality and reliability of these studies, it remains unclear to what extent it is possible to generalise their findings and to replicate their derived models. Moreover, there remains a need to demonstrate the conduct of prediction modelling studies in the oral health field following the contemporary guidelines. This doctoral project addresses these issues using two systematic reviews and two empirical analyses. 
This thesis is the first comprehensive and systematic project reviewing the study quality and demonstrating the use of registry data and longitudinal cohorts to develop clinical prediction models in oral health. Aims. • To identify and examine the quality of existing prediction modelling studies in the major fields of oral health. • To demonstrate the conduct and reporting of a prediction modelling study following current guidelines, incorporating machine learning algorithms and accounting for multiple sources of bias. Methods. As one of the most prevalent oral conditions, chronic periodontitis was chosen as the exemplar pathology for the first part of this thesis. A systematic review was conducted to investigate the existing prediction models for the incidence and progression of this condition. Based upon this initial overview, a more comprehensive critical review was conducted to assess the methodological quality and completeness of reporting for prediction modelling studies in the field of oral health. The risk of bias in the existing literature was assessed using the PROBAST criteria, and the quality of study reporting was measured in accordance with the TRIPOD guidelines. Following these two reviews, this research project demonstrated the conduct and reporting of a clinical prediction modelling study using two empirical examples. Two types of analyses that are commonly used for two different types of outcome data were adopted: survival analysis for censored outcomes and logistic regression analysis for binary outcomes. 
Models were developed to 1) predict the three- and five-year disease-specific survival of patients with oral and pharyngeal cancers, based on 21,154 cases collected by a large cancer registry program in the US, the Surveillance, Epidemiology and End Results (SEER) program, and 2) to predict the occurrence of acute and persistent pain following root canal treatment, based on the electronic dental records of 708 adult patients collected by the National Practice-Based Research Network. In these two case studies, all prediction models were developed in five steps: (i) framing the research question; (ii) data acquisition and pre-processing; (iii) model generation; (iv) model validation and performance evaluation; and (v) model presentation and reporting. In accordance with the PROBAST recommendations, the risk of bias during the modelling process was reduced in the following aspects: • In the first case study, three types of biases were taken into account: (i) bias due to missing data was reduced by adopting compatible methods to conduct imputation; (ii) bias due to unmeasured predictors was tested by sensitivity analysis; and (iii) bias due to the initial choice of modelling approach was addressed by comparing tree-based machine learning algorithms (survival tree, random survival forest and conditional inference forest) with the traditional statistical model (Cox regression). 
• In the second case study, the following strategies were employed: (i) missing data were addressed by multiple imputation with missing indicator methods; (ii) a multilevel logistic regression approach was adopted for model development in order to fit the hierarchical structure of the data; (iii) model complexity was reduced using the Least Absolute Shrinkage and Selection Operator (LASSO) for predictor selection; (iv) the models’ predictive performance was evaluated comprehensively by using the Area Under the Precision Recall Curve (AUPRC) in addition to the Area Under the Receiver Operating Characteristic curve (AUROC); and (v) finally, and most importantly, given the existing criticism in the research community concerning the gender-based and racial bias in risk prediction models, we compared the predictive performance of models built with different sets of predictors (including a clinical set, a sociodemographic set and a combination of both, the ‘general’ set). Results. The first and second review studies indicated that, in the field of oral health, the popularity of multivariable prediction models has increased in recent years. Bias and variance are two components of the uncertainty (e.g., the mean squared error) in model estimation. However, the majority of the existing studies did not account for various sources of bias, such as measurement error and inappropriate handling of missing data. Moreover, non-transparent reporting and lack of reproducibility of the models were also identified in the existing oral health prediction modelling studies. These findings provided motivation to conduct two case studies aimed at demonstrating adherence to the contemporary guidelines and to best practice. In the third study, comparable predictive capabilities between Cox regression and the non-parametric tree-based machine learning algorithms were observed for predicting the survival of patients with oral and pharyngeal cancers. 
For example, the C-index for a Cox model and a random survival forest in predicting three-year survival were 0.82 and 0.84, respectively. A novelty of this study was the development of an online calculator designed to provide an open and transparent estimation of patients’ survival probability for up to five years after diagnosis. This calculator has clinical translational potential and could aid in patient stratification and treatment planning, at least in the context of ongoing research. In addition, the transparent reporting of this study was achieved by following the TRIPOD checklist and sharing all data and code. In the fourth study, LASSO regression suggested that pre-treatment clinical factors were important in the development of one-week and six-month postoperative pain following root canal treatment. Among all the developed multilevel logistic models, models with a clinical set of predictors yielded similar predictive performance to models with a general set of predictors, while the models with sociodemographic predictors showed the weakest predictive ability. For example, for predicting one-week postoperative pain, the AUROC for models with clinical, sociodemographic and general predictors were 0.82, 0.68 and 0.84, respectively, and the AUPRC were 0.66, 0.40 and 0.72, respectively. Conclusion. The significance of this research project is twofold. First, prediction models have been developed for potential clinical use in the context of various oral conditions. Second, this research represents the first attempt to standardise the conduct of this type of study in oral health research. This thesis presents three conclusions: 1) Adherence to contemporary best practice guidelines such as PROBAST and TRIPOD is limited in the field of oral health research. In response, this PhD project disseminates these guidelines and leverages their advantages to develop effective prediction models for use in dentistry and oral health. 
2) Use of appropriate procedures, accounting for and adapting to multiple sources of bias in model development, produces predictive tools of increased reliability and accuracy that hold the potential to be implemented in clinical practice. Therefore, for future prediction modelling research, it is important that data analysts work towards eliminating bias, regardless of the areas in which the models are employed. 3) Machine learning algorithms provide alternatives to traditional statistical models for clinical prediction purposes. Additionally, in the presence of clinical factors, sociodemographic characteristics contribute less to the improvement of models’ predictive performance or to providing cogent explanations of the variance in the models, regardless of the modelling approach. Therefore, it is timely to reconsider the use of sociodemographic characteristics in clinical prediction modelling research. It is suggested that this is a proportionate and evidence-based strategy aimed at reducing biases in healthcare risk prediction that may be derived from gender and racial characteristics inherent in sociodemographic data sets. Thesis (Ph.D.) -- University of Adelaide, School of Public Health, 202
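
The thesis abstract above compares survival models by their C-index. For readers unfamiliar with the metric, here is a minimal sketch of Harrell's concordance index for right-censored data. It is a simplified illustration, not the thesis code: pairs with tied follow-up times are skipped, the function name is hypothetical, and the toy data are invented.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs whose predicted risks are
    ordered consistently with their observed survival times.

    times:       observed follow-up time per patient
    events:      1 if the event was observed, 0 if censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Reorder so that patient i has the shorter follow-up time.
        if times[j] < times[i]:
            i, j = j, i
        # A pair is comparable only if the shorter time is an observed
        # event; tied times are skipped in this simplified version.
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risk_scores[i] > risk_scores[j]:
            concordant += 1.0
        elif risk_scores[i] == risk_scores[j]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration (not from the thesis).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
print(concordance_index(times, events, risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so reported values of 0.82 and 0.84 indicate similar discrimination for the two models.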

Adelaide Research & Scholarship

Use of radiomic data to improve imputation of HPV (p16) status in oropharyngeal cancer

Author: Lascorz Guiu Aleix
Publication venue: Universitat Politècnica de Catalunya
Publication date: 24/10/2019

The incidence of oropharyngeal cancer has been steadily increasing during the past decades. This increase is linked with human papillomavirus (HPV), one of the most common sexually transmitted infections in Canada and worldwide. Recent studies have shown the importance of using p16 testing to assess the HPV status of all oropharyngeal cancer patients at diagnosis. However, that practice was not common in the early 2000s, leaving historical data incomplete. Many imputation models have been built to retroactively predict the HPV status of oropharyngeal cancer patients who were not tested. These models are based on clinical data, which is easy to store and analyze. However, recent advancements in the field of radiomics have enabled the use of CT scans obtained from patients to build models for cancer behavior. In this study, we take a novel approach to HPV status imputation by building machine learning models that utilize not only clinical data but also imaging features, aiming to show a significant improvement over classical models. The increase in performance between state-of-the-art clinical models and our models will be assessed through the use of the RADCURE dataset from the Princess MargaretOutgoin

UPCommons. Portal del coneixement obert de la UPC