
    Model-Free Feature Screening for Ultrahigh Dimensional Discriminant Analysis

    <div><p>This work is concerned with marginal sure independence feature screening for ultrahigh dimensional discriminant analysis. The response variable is categorical in discriminant analysis. This enables us to use the conditional distribution function to construct a new index for feature screening. In this article, we propose a marginal feature screening procedure based on the empirical conditional distribution function. We establish the sure screening and ranking consistency properties for the proposed procedure without assuming any moment condition on the predictors. The proposed procedure enjoys several appealing merits. First, it is model-free in that its implementation does not require specification of a regression model. Second, it is robust to heavy-tailed distributions of predictors and the presence of potential outliers. Third, it allows the categorical response to have a diverging number of classes in the order of <i>O</i>(<i>n</i><sup>κ</sup>) with some κ ⩾ 0. We assess the finite sample property of the proposed procedure by Monte Carlo simulation studies and numerical comparison. We further illustrate the proposed methodology by empirical analyses of two real-life datasets. Supplementary materials for this article are available online.</p></div>
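    As a rough illustration of what a screening index built from empirical conditional distribution functions can look like, the sketch below scores each predictor by the class-weighted mean squared gap between its within-class empirical CDF and its overall empirical CDF, then keeps the top-ranked predictors. The specific formula, the function names `mv_index` and `screen`, and the cutoff `d` are assumptions made for illustration; the abstract does not give the exact index, so this should not be read as the authors' implementation.

```python
import numpy as np

def mv_index(x, y):
    """Class-weighted mean squared gap between conditional and overall ECDFs.

    Illustrative index (an assumption, not taken from the abstract):
        sum_r p_r * mean_j (F_r(x_j) - F(x_j))**2
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    n = x.size
    # Overall empirical CDF evaluated at every sample point.
    F = np.searchsorted(np.sort(x), x, side="right") / n
    total = 0.0
    for label in np.unique(y):
        xr = np.sort(x[y == label])
        p_r = xr.size / n  # empirical class proportion
        # Empirical CDF within class `label`, evaluated at every sample point.
        F_r = np.searchsorted(xr, x, side="right") / xr.size
        total += p_r * np.mean((F_r - F) ** 2)
    return total

def screen(X, y, d):
    """Return indices of the d highest-scoring columns of X, plus all scores."""
    scores = np.array([mv_index(X[:, k], y) for k in range(X.shape[1])])
    return np.argsort(scores)[::-1][:d], scores
```

    Because the score depends on the predictor only through empirical CDFs, it is unchanged by strictly increasing transformations of the predictor, which is one way to see why an index of this kind needs no moment conditions.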

    The R workspace of bimj202200089.R2

    <p>Supplementary information / reproducible research files for the manuscript. Title: "Model-free conditional screening for ultrahigh-dimensional survival data via conditional distance correlation". Authors: Hengjian Cui, Yanyan Liu, Guangcai Mao and Jing Zhang. Introduction: It contains the R workspace that can be used to reproduce all results and figures of the manuscript.</p>

    Bar chart of the influenza virus subtypes in the southern area.

    <p>Different colors represent different influenza subtypes, as listed in the head.</p>

    Geographic Divisions and Modeling of Virological Data on Seasonal Influenza in the Chinese Mainland during the 2006–2009 Monitoring Years

    <div><p>Background</p><p>Seasonal influenza epidemics occur annually with bimodality in southern China and unimodality in northern China. Regional differences exist in surveillance data collected by the National Influenza Surveillance Network of the Chinese mainland. Qualitative and quantitative analyses on the spatiotemporal rules of the influenza virus's activities are needed to lay the foundation for the surveillance, prevention and control of seasonal influenza.</p> <p>Methods</p><p>The peak performance analysis and Fourier harmonic extraction methods were used to explore the spatiotemporal characteristics of the seasonal influenza virus activity and to obtain geographic divisions. In the first method, the concept of quality control was introduced and robust estimators were chosen to make the results more convincing. The dominant Fourier harmonics of the provincial time series were extracted in the second method, and the VARiable CLUSter (VARCLUS) procedure was used to variably cluster the extracted results. On the basis of the above geographic division results, three typical districts were selected and corresponding sinusoidal models were applied to fit the time series of the virological data.</p> <p>Results</p><p>The predominant virus during every peak is visible from the bar charts of the virological data. The results of the two methods that were used to obtain the geographic divisions have some consistencies with each other and with the virus activity mechanism. Quantitative models were established for three typical districts: the south1 district, including Guangdong, Guangxi, Jiangxi and Fujian; the south2 district, including Hunan, Hubei, Shanghai, Jiangsu and Zhejiang; and the north district, including the 14 northern provinces except Qinghai. The sinusoidal fitting models showed that the south1 district had strong annual periodicity with strong winter peaks and weak summer peaks. 
The south2 district had strong semi-annual periodicity with similarly strong summer and winter peaks, and the north district had strong annual periodicity with only winter peaks.</p> </div>
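    To make the idea of a sinusoidal fit with annual and semi-annual components concrete, here is a minimal sketch that fits a constant plus the first two harmonics by ordinary least squares. The weekly sampling (period of 52), the number of harmonics, and the function name `fit_harmonics` are assumptions chosen for illustration; the abstract does not specify the model form used for each district.

```python
import numpy as np

def fit_harmonics(t, y, period=52.0, n_harmonics=2):
    """Least-squares fit of a constant plus cosine/sine pairs.

    Harmonic k has period `period / k`, so k=1 is annual and k=2 is
    semi-annual when `period` is one year in sampling units (assumed weekly).
    Returns (coefficients, per-harmonic amplitudes, fitted values).
    """
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2.0 * np.pi * k * t / period
        cols += [np.cos(w), np.sin(w)]
    A = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    # Amplitude of harmonic k combines its cosine and sine coefficients.
    amps = np.hypot(coef[1::2], coef[2::2])
    return coef, amps, A @ coef

# Synthetic example: three years of weekly values with a dominant annual cycle.
t = np.arange(156)
y = 10 + 3 * np.cos(2 * np.pi * t / 52) + 1 * np.sin(4 * np.pi * t / 52)
coef, amps, fitted = fit_harmonics(t, y)
```

    Comparing the two amplitudes gives a simple way to describe a series as mainly annual or mainly semi-annual, which is the kind of distinction the abstract draws between the districts.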

    The comparison between raw data and sinusoidal model fit results for north district.

    <p>The black line is the time series of the raw data, while the red line is the fitted sinusoidal curve.</p>

    Bar chart of the influenza virus subtypes in the northern area.

    <p>Different colors represent different influenza subtypes, as listed in the head.</p>