46 research outputs found

Better subset regression

Author: Xiong Shifeng
Publication date: 18/03/2013

To find efficient screening methods for high dimensional linear regression models, this paper studies the relationship between model fitting and screening performance. Under a sparsity assumption, we show that, in a general asymptotic setting, a subset that includes the true submodel always yields a smaller residual sum of squares (i.e., has better model fitting) than any subset that does not. This indicates that, for screening important variables, we could follow a "better fitting, better screening" rule, i.e., pick a "better" subset that has better model fitting. To seek such a better subset, we consider the optimization problem associated with best subset regression. An EM algorithm, called orthogonalizing subset screening, and its accelerated version are proposed for searching for the best subset. Although the two algorithms cannot guarantee that the subset they yield is the best, their monotonicity property ensures that the subset has better model fitting than the initial subsets generated by popular screening methods, and thus the subset can have better screening performance asymptotically. Simulation results show that our methods are very competitive in high dimensional variable screening even for finite sample sizes.
Comment: 24 pages, 1 figure

arXiv.org e-Print Archive

CiteSeerX

Unweighted estimation based on optimal sample under measurement constraints

Authors: Wang HaiYing, Wang Jing, Xiong Shifeng
Publication date: 08/10/2022

To tackle massive data, subsampling is a practical approach for selecting the more informative data points. However, when responses are expensive to measure, developing efficient subsampling schemes is challenging, and an optimal sampling approach under measurement constraints was developed to meet this challenge. This method uses the inverses of the optimal sampling probabilities to reweight the objective function, which assigns smaller weights to the more important data points. Because the most informative points are thereby down-weighted, the estimation efficiency of the resulting estimator can still be improved. In this paper, we propose an unweighted estimating procedure based on optimal subsamples to obtain a more efficient estimator. We obtain the unconditional asymptotic distribution of the estimator via martingale techniques without conditioning on the pilot estimate, an approach that has been less investigated in the existing subsampling literature. Both asymptotic and numerical results show that the unweighted estimator is more efficient in parameter estimation.

arXiv.org e-Print Archive

CRISPR/Cas9‐mediated restoration of Tamyb10 to create pre‐harvest sprouting‐resistant red wheat

Authors: Cheng Shifeng, Fan Yujin, Goodrich Justin, He Yuhan, Li Pengfeng, Lin Yarong, Wang Feng, Wang Ke, Wang Yiwei, Xiong Jiang, Ye Xingguo, Zhang Cui‐Jun, Zhu Jian‐Kang, Zhu Yiwang
Publication venue: Wiley
Publication date: 18/12/2022

Edinburgh Research Explorer