161 research outputs found

An Algorithm for Generating Individualized Treatment Decision Trees and Random Forests

Authors: Haoda Fu (5039792), Hua Zhou (50017), Jin Zhou (90734), Kevin Doubleday (5039795)
Publication date: 29/03/2018

With new treatments and novel technology available, precision medicine has become a key topic in the new era of healthcare. Traditional statistical methods for precision medicine focus on subgroup discovery through identifying interactions between a few markers and treatment regimes. However, given the large scale and high dimensionality of modern datasets, it is difficult to detect the interactions between treatment and high-dimensional covariates. Recently, novel approaches have emerged that seek to directly estimate individualized treatment rules (ITR) via maximizing the expected clinical reward by using, for example, support vector machines (SVM) or decision trees. The latter enjoys great popularity in clinical practice due to its interpretability. In this article, we propose a new reward function and a novel decision tree algorithm to directly maximize rewards. We further improve a single tree decision rule by an ensemble decision tree algorithm, ITR random forests. Our final decision rule is an average over single decision trees and it is a soft probability rather than a hard choice. Depending on how strong the treatment recommendation is, physicians can make decisions based on our model along with their own judgment and experience. Performance of ITR forest and tree methods is assessed through simulations along with applications to a randomized controlled trial (RCT) of 1385 patients with diabetes and an EMR cohort of 5177 patients with diabetes. ITR forest and tree methods are implemented using statistical software R (https://github.com/kdoub5ha/ITR.Forest). Supplementary materials for this article are available online.

Crossref

eScholarship - University of California

FigShare

Regression Models for Multivariate Count Data

Authors: Hua Zhou (50017), Jin Zhou (90734), Wei Sun (93580), Yiwen Zhang (512835)
Publication date: 01/01/2017

Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of overdispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly because they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data. Supplementary materials for this article are available online.

PubMed Central

eScholarship - University of California

FigShare

Effects of DMP exposure on embryo hatching rates (%) of abalone gametes.

Authors: Jin Zhou (90734), Xiao-Shan Zhu (341719), Zhong-Hua Cai (341721)

Values are means ± SD. Different letters denote significantly different values (P<0.05); columns that share the same letter are not significantly different (P>0.05).

FigShare

Prediction accuracy in MESA using lead SNPs vs. SNPs identified in C/J analysis at different p-value thresholds.

Authors: Jin Zhou (90734), Nathan E. Wineinger (170965), Yann C. Klimentidis (147995)

FigShare

Prediction accuracy in MESA at 3 loci with additional detected SNPs at the 5×10⁻⁸ threshold.

Authors: Jin Zhou (90734), Nathan E. Wineinger (170965), Yann C. Klimentidis (147995)

FigShare

Effects of DMP exposure on embryo abnormality rates (%) of abalone gametes.

Authors: Jin Zhou (90734), Xiao-Shan Zhu (341719), Zhong-Hua Cai (341721)

Values are means ± SD. Different letters denote significantly different values (P<0.05); columns that share the same letter are not significantly different (P>0.05).

FigShare

Light microscope images of an abalone sperm (A) and egg (B).

Authors: Jin Zhou (90734), Xiao-Shan Zhu (341719), Zhong-Hua Cai (341721)

The full length of the sperm is approximately 45 µm, and the diameter of the egg is approximately 150 µm.

FigShare

Effects of DMP exposure on total lipid levels of eggs and ATPase activities of sperm.

Authors: Jin Zhou (90734), Xiao-Shan Zhu (341719), Zhong-Hua Cai (341721)

(A) Total lipid levels of eggs. (B) ATPase activities of sperm. Each bar represents the mean ± SD. Data are representative of three independent experiments. Significant differences (P<0.05, one-way ANOVA) in total lipid levels and ATPase activities between the experimental and control groups are indicated with different letters.

FigShare

Effects of DMP exposure on fertilization rates (%) of abalone gametes.

Authors: Jin Zhou (90734), Xiao-Shan Zhu (341719), Zhong-Hua Cai (341721)

The percentages of fertilization in different protocols were determined by counting approximately 100–150 randomly sampled eggs. Data are means ± SD of three tests. Different letters denote statistically significant differences between control and treatment groups determined by one-way ANOVA (b: P<0.05; c: P<0.01).

FigShare

Variance explained at various p-value thresholds in the MESA validation dataset by the collection of individual SNPs on the liability scale, variance explained by, and model fit of, the weighted GRS, using Nagelkerke's R², and AIC, respectively.

Authors: Jin Zhou (90734), Nathan E. Wineinger (170965), Yann C. Klimentidis (147995)

FigShare
