11 research outputs found

    Heterogeneity Aware Random Forest for Drug Sensitivity Prediction

    Abstract: Samples collected in pharmacogenomics databases typically belong to various cancer types. When designing a drug sensitivity predictive model from such a database, a natural question arises: will a model trained on diverse, inter-tumor heterogeneous samples perform similarly to a predictive model that takes the heterogeneity of the samples into consideration in model training and prediction? We explore this hypothesis and observe that ensemble model predictions obtained when cancer type is known outperform predictions when that information is withheld, even when the sample sizes for the former are considerably lower than the combined sample size. To incorporate the heterogeneity idea in the commonly used ensemble-based predictive model of Random Forests, we propose Heterogeneity Aware Random Forests (HARF), which assign weights to the trees based on the category of the sample. We treat heterogeneity as a latent class allocation problem and present a covariate-free class allocation approach based on the distribution of leaf nodes of the model ensemble. Applications on the CCLE and GDSC databases show that HARF outperforms traditional Random Forest when the average drug responses of cancer types are different.
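
The idea of weighting trees by sample category can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the weights are computed, so the inverse-error rule below is an assumption, as are the function names `category_tree_weights` and `weighted_predict`. The sketch also assumes the category of each sample is already known, whereas the abstract describes allocating a latent class from leaf-node distributions.

```python
from typing import Dict, List, Sequence


def category_tree_weights(
    tree_preds: Sequence[Sequence[float]],
    targets: Sequence[float],
    categories: Sequence[str],
    eps: float = 1e-9,
) -> Dict[str, List[float]]:
    """Return one normalized weight per tree for every category.

    tree_preds[t][i] is the prediction of tree t for sample i.
    Assumed rule (not from the abstract): a tree's weight within a
    category is proportional to the inverse of its mean squared error
    on the samples of that category.
    """
    weights: Dict[str, List[float]] = {}
    for cat in set(categories):
        # Indices of the samples that belong to this category.
        idx = [i for i, c in enumerate(categories) if c == cat]
        inv_err = []
        for preds in tree_preds:
            mse = sum((preds[i] - targets[i]) ** 2 for i in idx) / len(idx)
            inv_err.append(1.0 / (mse + eps))  # eps avoids division by zero
        total = sum(inv_err)
        weights[cat] = [w / total for w in inv_err]
    return weights


def weighted_predict(tree_outputs: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-tree outputs for one sample using its category's weights."""
    return sum(w * p for w, p in zip(weights, tree_outputs))


# Hypothetical toy data: two trees, four samples, two cancer types.
tree_preds = [[1.0, 1.0, 5.0, 5.0], [3.0, 3.0, 3.0, 3.0]]
targets = [1.0, 1.0, 3.0, 3.0]
categories = ["lung", "lung", "skin", "skin"]

weights = category_tree_weights(tree_preds, targets, categories)
lung_prediction = weighted_predict([1.0, 3.0], weights["lung"])
```

In this toy setup the first tree dominates the prediction for "lung" samples and the second for "skin" samples, whereas a plain Random Forest would average both trees equally for every sample.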

    Factors Predictive of the Status of Sentinel Lymph Nodes in Melanoma Patients from a Large Multicenter Database

    Numerous predictive factors for cutaneous melanoma metastases to sentinel lymph nodes have been identified; however, few have been found to be reproducibly significant. This study investigated the significance of factors for predicting regional nodal disease in cutaneous melanoma using a large multicenter database. Seventeen institutions submitted retrospective and prospective data on 3463 patients undergoing sentinel lymph node (SLN) biopsy for primary melanoma. Multiple demographic and tumor factors were analyzed for correlation with a positive SLN. Univariate and multivariate statistical analyses were performed. Of 3445 analyzable patients, 561 (16.3%) had a positive SLN biopsy. In multivariate analysis of 1526 patients with complete records for 10 variables, increasing Breslow thickness, lymphovascular invasion, ulceration, younger age, the absence of regression, and tumor location on the trunk were statistically significant predictors of a positive SLN. These results confirm the predictive significance of the well-established variables of Breslow thickness, ulceration, age, and location, as well as consistently reported but less well-established variables such as lymphovascular invasion. In addition, the presence of regression was associated with a lower likelihood of a positive SLN. Consideration of multiple tumor parameters should influence the decision for SLN biopsy and the estimation of nodal metastatic disease risk.