161 research outputs found

    An Algorithm for Generating Individualized Treatment Decision Trees and Random Forests

    With new treatments and novel technology available, precision medicine has become a key topic in the new era of healthcare. Traditional statistical methods for precision medicine focus on subgroup discovery through identifying interactions between a few markers and treatment regimes. However, given the large scale and high dimensionality of modern datasets, it is difficult to detect the interactions between treatment and high-dimensional covariates. Recently, novel approaches have emerged that seek to directly estimate individualized treatment rules (ITR) by maximizing the expected clinical reward using, for example, support vector machines (SVM) or decision trees. The latter enjoy great popularity in clinical practice due to their interpretability. In this article, we propose a new reward function and a novel decision tree algorithm to directly maximize rewards. We further improve a single tree decision rule by an ensemble decision tree algorithm, ITR random forests. Our final decision rule is an average over single decision trees, and it is a soft probability rather than a hard choice. Depending on how strong the treatment recommendation is, physicians can make decisions based on our model along with their own judgment and experience. Performance of ITR forest and tree methods is assessed through simulations along with applications to a randomized controlled trial (RCT) of 1385 patients with diabetes and an EMR cohort of 5177 patients with diabetes. ITR forest and tree methods are implemented in the statistical software R (https://github.com/kdoub5ha/ITR.Forest). Supplementary materials for this article are available online.

    Regression Models for Multivariate Count Data

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of overdispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly because they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. Supplementary materials for this article are available online.

    Effects of DMP exposure on embryo hatching rates (%) of abalone gametes.

    Values are means ± SD. Different letters denote significantly different values (P<0.05); columns that share the same letter are not significantly different (P>0.05).

    Prediction accuracy in MESA using lead SNPs vs. SNPs identified in C/J analysis at different p-value thresholds.

    Prediction accuracy in MESA at 3 loci with additional detected SNPs at the 5×10⁻⁸ threshold.

    Effects of DMP exposure on embryo abnormality rates (%) of abalone gametes.

    Values are means ± SD. Different letters denote significantly different values (P<0.05); columns that share the same letter are not significantly different (P>0.05).

    Light microscope images of an abalone sperm (A) and egg (B).

    The full length of the sperm is approximately 45 µm, and the diameter of the egg is approximately 150 µm.

    Effects of DMP exposure on total lipid levels of eggs and ATPase activities of sperm.

    (A) Total lipid levels of eggs. (B) ATPase activities of sperm. Each bar represents the mean ± SD. Data are representative of three independent experiments. Significant differences (P<0.05, one-way ANOVA) in total lipid levels and ATPase activities between the experimental and control groups are indicated with different letters.

    Effects of DMP exposure on fertilization rates (%) of abalone gametes.

    The percentages of fertilization in different protocols were determined by counting approximately 100–150 randomly sampled eggs. Data are means ± SD of three tests. Different letters denote statistically significant differences between control and treatment groups determined by one-way ANOVA (b: P<0.05; c: P<0.01).

    Variance explained at various p-value thresholds in the MESA validation dataset by the collection of individual SNPs on the liability scale, variance explained by, and model fit of, the weighted GRS, using Nagelkerke's R², and AIC, respectively.
