187 research outputs found

    A Provable Smoothing Approach for High Dimensional Generalized Regression with Applications in Genomics

    In many applications, linear models fit the data poorly. This article studies an appealing alternative, the generalized regression model. This model only assumes that there exists an unknown monotonically increasing link function connecting the response $Y$ to a single index $X^T\beta^*$ of explanatory variables $X\in\mathbb{R}^d$. The generalized regression model is flexible and covers many widely used statistical models. It fits the data generating mechanisms well in many real problems, which makes it useful in a variety of applications where regression models are regularly employed. In low dimensions, rank-based M-estimators are recommended to deal with the generalized regression model, giving root-$n$ consistent estimators of $\beta^*$. Applications of these estimators to high dimensional data, however, are questionable. This article studies, both theoretically and practically, a simple yet powerful smoothing approach to handle the high dimensional generalized regression model. Theoretically, a family of smoothing functions is provided, and the amount of smoothing necessary for efficient inference is carefully calculated. Practically, our study is motivated by an important and challenging scientific problem: decoding gene regulation by predicting transcription factors that bind to cis-regulatory elements. Applying our proposed method to this problem shows substantial improvement over the state-of-the-art alternative in real data. Comment: 53 pages
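
    The idea of smoothing a rank-based objective can be sketched as follows. This is only an illustration under stated assumptions, not the article's estimator: it takes the classical pairwise rank-correlation objective and replaces the inner indicator with a sigmoid. The function names, the bandwidth `h`, and the toy data are all hypothetical choices made for this example.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def smoothed_rank_objective(beta, X, y, h=0.1):
    """Sigmoid-smoothed pairwise rank-correlation objective (larger is better).

    For every ordered pair with y[i] > y[j], the hard indicator
    1(x_i'beta > x_j'beta) is replaced by sigmoid((x_i'beta - x_j'beta) / h).
    The bandwidth h is an assumed tuning parameter for this sketch.
    """
    index = [sum(b * v for b, v in zip(beta, row)) for row in X]
    n = len(y)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if y[i] > y[j]:
                total += sigmoid((index[i] - index[j]) / h)
    return total / (n * (n - 1))


# Toy data: the response depends on the index only through an
# increasing link (exp), which a linear model would not capture.
rng = random.Random(0)
beta_true = [1.0, -0.5, 0.0]
X = [[rng.gauss(0.0, 1.0) for _ in beta_true] for _ in range(60)]
y = [
    math.exp(sum(b * v for b, v in zip(beta_true, row))) + 0.1 * rng.gauss(0.0, 1.0)
    for row in X
]

score_true = smoothed_rank_objective(beta_true, X, y)
score_flipped = smoothed_rank_objective([-b for b in beta_true], X, y)
```

    Because the objective depends on the data only through the ordering of the responses, it is unchanged by any increasing transformation of `y`. In this toy setting `score_true` exceeds `score_flipped`, since the true direction orders most pairs concordantly.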

    Empirical Likelihood Ratio Tests for Coefficients in High Dimensional Heteroscedastic Linear Models

    This paper considers hypothesis testing problems for a low-dimensional coefficient vector in a high-dimensional linear model with heteroscedastic variance. Heteroscedasticity is a commonly observed phenomenon in many applications, including finance and genomic studies. Several statistical inference procedures have been proposed for low-dimensional coefficients in a high-dimensional linear model with homoscedastic variance, but they are not applicable to models with heteroscedastic variance. The heteroscedasticity issue has rarely been studied. We propose a simple inference procedure based on empirical likelihood to overcome the heteroscedasticity issue. The proposed method is able to make valid inference even when the conditional variance of the random error is an unknown function of high-dimensional predictors. We apply our inference procedure to three recently proposed estimating equations and establish the asymptotic distributions of the proposed methods. Simulation studies and real data applications are conducted to demonstrate the proposed methods.
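
    For readers unfamiliar with empirical likelihood, the sketch below computes the classical empirical likelihood ratio statistic for a scalar mean. It is a textbook special case and not the paper's high-dimensional procedure; the function name and the sample values are hypothetical, and the Lagrange multiplier is found by plain bisection for simplicity.

```python
import math


def el_ratio_statistic(x, mu, iterations=100):
    """Return -2 log R(mu) for the empirical likelihood of a scalar mean.

    The weights are w_i = 1 / (n * (1 + lam * z_i)) with z_i = x_i - mu,
    where lam solves sum(z_i / (1 + lam * z_i)) = 0.
    """
    z = [xi - mu for xi in x]
    if min(z) >= 0 or max(z) <= 0:
        # mu lies outside the convex hull of the data: the ratio is zero.
        return math.inf

    # Positivity of the weights restricts lam to this open interval.
    lo = -1.0 / max(z)
    hi = -1.0 / min(z)

    def score(lam):
        return sum(zi / (1.0 + lam * zi) for zi in z)

    # score is strictly decreasing on (lo, hi), so bisection finds its root.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * zi) for zi in z)


# Hypothetical sample used only to exercise the function.
sample = [1.2, 0.7, 2.5, 1.9, 3.1, 0.4, 2.2]
sample_mean = sum(sample) / len(sample)
stat_at_mean = el_ratio_statistic(sample, sample_mean)
stat_at_one = el_ratio_statistic(sample, 1.0)
```

    The statistic is essentially zero at the sample mean and grows as the hypothesized mean moves away from it. In the classical setting it is compared with a chi-squared quantile, which requires no explicit variance estimate.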

    Simulating games using object-oriented methodology

    In this report, we present a Bridge simulator and discuss object-oriented analysis, design and programming. The design phase uses automated support to illustrate how we apply the concepts of object-oriented methodology to develop software, in this case a Bridge simulator. The implementation of the Bridge simulator demonstrates the programming process using an object-oriented language (C++). Important features of the Bridge simulator are the use of the object-oriented paradigm for design and the use of the X Window/Motif toolkits to construct a user interface for simulating the bidding and the playing of the game of Bridge. We conclude with the results of the Bridge simulator, discuss research on computer Bridge, and suggest directions in which the project could be extended.

    Unified empirical likelihood ratio tests for functional concurrent linear models and the phase transition from sparse to dense functional data

    We consider the problem of testing functional constraints in a class of functional concurrent linear models where both the predictors and the response are functional data measured at discrete time points. We propose test procedures based on the empirical likelihood with bias‐corrected estimating equations to conduct both pointwise and simultaneous inferences. The asymptotic distributions of the test statistics are derived under the null and local alternative hypotheses, where sparse and dense functional data are considered in a unified framework. We find a phase transition in the asymptotic null distributions and the orders of detectable alternatives from sparse to dense functional data. Specifically, the tests proposed can detect alternatives of √n‐order when the number of repeated measurements per curve is of an order larger than $n^{\eta_0}$, with $n$ being the number of curves. The transition points $\eta_0$ for pointwise and simultaneous tests are different and both are smaller than the transition point in the estimation problem. Simulation studies and real data analyses are conducted to demonstrate the methods proposed.

    Development of a novel immunoperoxidase monolayer assay for detection of swine Hepatitis E virus antibodies based on stable cell lines expressing the ORF3 protein

    Hepatitis E virus (HEV) strains are classified into 4 genotypes by nucleotide sequencing. Genotypes 3 and 4 infect humans and animals via HEV-contaminated food or water. HEV RNA is detected by PCR and antibodies are detected by ELISA. Since human studies showed that HEV IgG antibodies in sera can persist for extended periods, diagnosis of HEV infection in swine or humans is mainly based on serological detection using commercial ELISA kits. However, there is no supplemental method to verify ELISA results. Hence, we developed a novel method that can be used for mutual verification alongside these common assays. Here, a modified stable HepG2 cell line was transfected with pcDNA3.1-ORF3 to express the swine HEV ORF3 protein. Based on this cell line, a novel immunoperoxidase monolayer assay (IPMA) was developed to detect antibodies against HEV. The results show that this method has good specificity, sensitivity and repeatability. When used to investigate 141 porcine serum samples, the IPMA had a coincidence rate of 92.2% with a commercial ELISA kit. The established IPMA described herein is valuable as a supplemental method to ELISA and can differentiate infections by HEV and other viruses.

    The effects of short-term rainfall variability on leaf isotopic traits of desert plants in sand-binding ecosystems

    Author's manuscript made available in accordance with the publisher's policy. Sand-binding vegetation is effective in stabilizing sand dunes and reducing soil erosion, and thus helps minimize the detrimental effects of desertification. The aim of this study is to better understand the relationships between water and nutrient usage of sand-binding species, and the effects of succession and rainfall variability on plants’ water–nutrient interactions. We examined the effects of long-term succession (50 years), inter-annual rainfall variability (from 65% of the mean annual precipitation in 2004 to 42% in 2005) and seasonality on water–nutrient interactions of three major sand-binding species (Artemisia ordosica, Hedysarum scoparium and Caragana korshinskii) by measuring foliar δ13C, δ15N and [N]. Long-term succession in general did not significantly alter δ13C, δ15N and [N] of the three species. Short-term rainfall variability, however, significantly increased foliar δ13C levels of all three species by 1.0–1.8‰ during the severely dry year. No significant seasonal patterns were found in foliar δ13C and δ15N values of the three species, whereas foliar [N] varied by season. For the two leguminous shrubs, the correlations between δ13C and δ15N were positive in both sampling years, and a positive correlation between [N] and δ13C was found only in the severely dry year. The results indicate that these sand-binding plants have reached a relatively stable stage and are able to regulate their nitrogen and water use in response to environmental conditions, which reinforces the effectiveness of planting native shrubs without irrigation in degraded areas. However, the results also indicate that short-term climate variability could have a severe impact on vegetation functions.

    Robust estimation of heterogeneous treatment effects using electronic health record data

    Estimation of heterogeneous treatment effects is an essential component of precision medicine. Model and algorithm-based methods have been developed within the causal inference framework to achieve valid estimation and inference. Existing methods such as the A-learner, R-learner, modified covariates method (with and without efficiency augmentation), inverse propensity score weighting, and augmented inverse propensity score weighting have been proposed mostly under the squared error loss function. The performance of these methods in the presence of data irregularity and high dimensionality, such as that encountered in electronic health record (EHR) data analysis, has been less studied. In this research, we describe a general formulation that unifies many of the existing learners through a common score function. The new formulation allows the incorporation of least absolute deviation (LAD) regression and dimension reduction techniques to counter the challenges in EHR data analysis. We show that under a set of mild regularity conditions, the resultant estimator has an asymptotic normal distribution. Within this framework, we propose two specific estimators for EHR analysis based on weighted LAD with simultaneous penalties for sparsity and smoothness. Our simulation studies show that the proposed methods are more robust to outliers under various circumstances. We use these methods to assess the blood pressure-lowering effects of two commonly used antihypertensive therapies.

    Application of genetics and genomics to aquaculture development: current and future directions

    Global aquaculture production continues to grow rapidly, yet only a small proportion of the animals and plants being used come from managed breeding and improvement programmes. The biology of aquatic organisms offers many opportunities for rapid genetic gains as new genetic and genomic techniques make the management of improvement programmes feasible in a wider range of species. The current paper describes the application of a wide range of techniques, many unique to aquatic organisms, and their potential to secure aquaculture production in the future.