187 research outputs found
A Provable Smoothing Approach for High Dimensional Generalized Regression with Applications in Genomics
In many applications, linear models fit the data poorly. This article studies
an appealing alternative, the generalized regression model. This model only
assumes that there exists an unknown monotonically increasing link function
connecting the response to a single index of explanatory
variables. The generalized regression model is flexible and
covers many widely used statistical models. It fits the data generating
mechanisms well in many real problems, which makes it useful in a variety of
applications where regression models are regularly employed. In low dimensions,
rank-based M-estimators are recommended to deal with the generalized regression
model, giving root-n consistent estimators of the index coefficients. Applications of
these estimators to high dimensional data, however, are questionable. This
article studies, both theoretically and practically, a simple yet powerful
smoothing approach to handle the high dimensional generalized regression model.
Theoretically, a family of smoothing functions is provided, and the amount of
smoothing necessary for efficient inference is carefully calculated.
Practically, our study is motivated by an important and challenging scientific
problem: decoding gene regulation by predicting transcription factors that bind
to cis-regulatory elements. Applying our proposed method to this problem shows
substantial improvement over the state-of-the-art alternative in real data. Comment: 53 page
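The smoothing idea in the abstract above can be illustrated with a small sketch. It assumes a rank-based objective of the maximum-rank-correlation type, with the indicator on the index difference replaced by a sigmoid. The sigmoid smoother, the bandwidth `h`, and the function name `smoothed_rank_objective` are illustrative assumptions; the abstract does not say which family of smoothing functions or which amount of smoothing the article uses.

```python
import numpy as np

def smoothed_rank_objective(beta, X, y, h=0.1):
    """Smoothed rank-correlation objective (illustrative, not the paper's exact estimator).

    Averages, over ordered pairs (i, j) with y[i] > y[j], a sigmoid of the
    index difference (X[i] - X[j]) @ beta / h. As h shrinks, the sigmoid
    approaches the indicator 1{X[i] @ beta > X[j] @ beta}.
    """
    index = X @ beta
    diff_index = index[:, None] - index[None, :]          # pairwise index differences
    y_greater = (y[:, None] > y[None, :]).astype(float)   # 1 where y[i] > y[j]
    # Numerically stable sigmoid: 1 / (1 + exp(-z)) == 0.5 * (1 + tanh(z / 2))
    smooth = 0.5 * (1.0 + np.tanh(diff_index / (2.0 * h)))
    n = len(y)
    return float((y_greater * smooth).sum() / (n * (n - 1)))

# Usage on synthetic data with an unknown monotone link (here exp, an assumption)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
beta_demo = np.array([1.0, -0.5, 0.25])
y_demo = np.exp(X_demo @ beta_demo)
print(smoothed_rank_objective(beta_demo, X_demo, y_demo))
```

Because the objective depends on the response only through pairwise orderings, it is unchanged by any strictly increasing transformation of `y`, which is what makes rank-based criteria suitable when the link function is unknown.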
Empirical Likelihood Ratio Tests for Coefficients in High Dimensional Heteroscedastic Linear Models
This paper considers hypothesis testing problems for a low-dimensional coefficient vector in a high-dimensional linear model with heteroscedastic variance. Heteroscedasticity is a commonly observed phenomenon in many applications, including finance and genomic studies. Several statistical inference procedures have been proposed for low-dimensional coefficients in a high-dimensional linear model with homoscedastic variance, but these are not applicable to models with heteroscedastic variance. The heteroscedasticity issue has rarely been investigated. We propose a simple inference procedure based on empirical likelihood to overcome the heteroscedasticity issue. The proposed method is able to make valid inference even when the conditional variance of the random error is an unknown function of high-dimensional predictors. We apply our inference procedure to three recently proposed estimating equations and establish the asymptotic distributions of the proposed methods. Simulation studies and real data applications are conducted to demonstrate the proposed methods.
Simulating games using object-oriented methodology
In this report, we present a Bridge simulator and discuss object-oriented analysis, design and programming. The design phase uses automated support to illustrate how we apply the concepts of object-oriented methodology to develop software, in this case a Bridge simulator. The implementation of the Bridge simulator demonstrates the programming process using an object-oriented language (C++). Important features of the Bridge simulator are the use of the object-oriented paradigm for design and the use of the X Window/Motif toolkits to construct a user interface for simulating the bidding and the playing of the game of Bridge. We conclude with the results of the Bridge simulator, discuss research on computer Bridge and suggest directions in which the project could be extended.
Unified empirical likelihood ratio tests for functional concurrent linear models and the phase transition from sparse to dense functional data
We consider the problem of testing functional constraints in a class of functional concurrent linear models where both the predictors and the response are functional data measured at discrete time points. We propose test procedures based on the empirical likelihood with bias-corrected estimating equations to conduct both pointwise and simultaneous inferences. The asymptotic distributions of the test statistics are derived under the null and local alternative hypotheses, where sparse and dense functional data are considered in a unified framework. We find a phase transition in the asymptotic null distributions and the orders of detectable alternatives from sparse to dense functional data. Specifically, the tests proposed can detect alternatives of √n-order when the number of repeated measurements per curve is of an order larger than [formula], with n being the number of curves. The transition points [formula] for pointwise and simultaneous tests are different and both are smaller than the transition point in the estimation problem. Simulation studies and real data analyses are conducted to demonstrate the methods proposed.
Development of a novel immunoperoxidase monolayer assay for detection of swine Hepatitis E virus antibodies based on stable cell lines expressing the ORF3 protein
Hepatitis E virus (HEV) strains are classified into 4 genotypes by nucleotide sequencing. Genotypes 3 and 4 infect humans and animals via HEV-contaminated food or water. HEV RNA is detected by PCR and antibodies are detected by ELISA. Since human studies showed that HEV IgG antibodies in sera can persist for extended periods, diagnosis of HEV infection in swine or humans is mainly based on serological detection using commercial ELISA kits. However, there is no supplemental method to verify ELISA results. Hence, we developed a novel method that can be used for mutual correction of these common processes. Here, a modified stable HepG2 cell line was transfected with pcDNA3.1-ORF3 to express the swine HEV ORF3 protein. Based on this cell line, a novel immunoperoxidase monolayer assay (IPMA) was developed to detect antibodies against HEV. The results show that this method has good specificity, sensitivity and repeatability. When used to investigate 141 porcine serum samples, the IPMA had a coincidence rate of 92.2% with a commercial ELISA kit. The established IPMA described herein is valuable as a supplemental method to ELISA and can differentiate infections by HEV and other viruses.
The effects of short-term rainfall variability on leaf isotopic traits of desert plants in sand-binding ecosystems
Author's manuscript made available in accordance with the publisher's policy. Sand-binding vegetation is effective in stabilizing sand dunes and reducing soil erosion, and thus helps minimize the detrimental effects of desertification. The aim of this study is to better understand the relationships between water and nutrient usage of sand-binding species, and the effects of succession and rainfall variability on plants' water-nutrient interactions. We examined the effects of long-term succession (50 years), inter-annual rainfall variability (from 65% of the mean annual precipitation in 2004 to 42% in 2005) and seasonality on water-nutrient interactions of three major sand-binding species (Artemisia ordosica, Hedysarum scoparium and Caragana korshinskii) by measuring foliar δ13C, δ15N and [N]. Long-term succession in general did not significantly alter δ13C, δ15N and [N] of the three species. Short-term rainfall variability, however, significantly increased foliar δ13C levels of all three species by 1.0-1.8‰ during the severely dry year. No significant seasonal patterns were found in foliar δ13C and δ15N values of the three species, whereas foliar [N] varied by season. For the two leguminous shrubs, the correlations between δ13C and δ15N were positive in both sampling years, and the positive correlation between [N] and δ13C was only found in the severely dry year. The results indicate that these sand-binding plants have developed into a relatively stable stage and that they are able to regulate their nitrogen and water use in response to environmental conditions, which reinforces the effectiveness of planting native shrubs without irrigation in degraded areas. However, the results also indicate that short-term climate variability could have a severe impact on vegetation functions.
Robust estimation of heterogeneous treatment effects using electronic health record data
Estimation of heterogeneous treatment effects is an essential component of precision medicine. Model and algorithm-based methods have been developed within the causal inference framework to achieve valid estimation and inference. Existing methods such as the A-learner, R-learner, modified covariates method (with and without efficiency augmentation), inverse propensity score weighting, and augmented inverse propensity score weighting have been proposed mostly under the squared error loss function. The performance of these methods in the presence of data irregularity and high dimensionality, such as that encountered in electronic health record (EHR) data analysis, has been less studied. In this research, we describe a general formulation that unifies many of the existing learners through a common score function. The new formulation allows the incorporation of least absolute deviation (LAD) regression and dimension reduction techniques to counter the challenges in EHR data analysis. We show that under a set of mild regularity conditions, the resultant estimator has an asymptotic normal distribution. Within this framework, we propose two specific estimators for EHR analysis based on weighted LAD with simultaneous penalties for sparsity and smoothness. Our simulation studies show that the proposed methods are more robust to outliers under various circumstances. We use these methods to assess the blood pressure-lowering effects of two commonly used antihypertensive therapies.
Application of genetics and genomics to aquaculture development: current and future directions
Global aquaculture production continues to grow rapidly, yet only a small proportion of the animals and plants being used come from managed breeding and improvement programmes. The biology of aquatic organisms offers many opportunities for rapid genetic gains as new genetic and genomic techniques make the management of improvement programmes feasible in a wider range of species. The current paper describes the application of a wide range of techniques, many unique to aquatic organisms, and their potential to secure aquaculture production in the future.