Search CORE

5,547 research outputs found

Cramér type moderate deviation theorems for self-normalized processes

Author: Shao Qi-Man
Zhou Wen-Xin
Publication venue: 'Bernoulli Society for Mathematical Statistics and Probability'
Publication date: 06/06/2016

Cramér type moderate deviation theorems quantify the accuracy of the relative error of the normal approximation and provide theoretical justifications for many commonly used methods in statistics. In this paper, we develop a new randomized concentration inequality and establish a Cramér type moderate deviation theorem for general self-normalized processes which include many well-known Studentized nonlinear statistics. In particular, a sharp moderate deviation theorem under optimal moment conditions is established for Studentized $U$-statistics. Comment: Published at http://dx.doi.org/10.3150/15-BEJ719 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
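For orientation, a Cramér type moderate deviation result generically has the shape sketched below. The symbols $T_n$ (a Studentized statistic), $\Phi$ (the standard normal distribution function) and $a_n$ (the admissible range) are notation assumed here for illustration and do not come from the abstract; the exact range and moment conditions are specific to each theorem.

```latex
\[
  \frac{P(T_n \ge x)}{1 - \Phi(x)} \longrightarrow 1
  \quad \text{uniformly in } 0 \le x \le a_n ,
  \qquad a_n \to \infty .
\]
```

As a point of comparison, for the classical self-normalized sum of i.i.d. centered random variables with a finite third moment, the known range is $a_n = o(n^{1/6})$.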

arXiv.org e-Print Archive

CiteSeerX

Crossref

eScholarship - University of California

Multi-Label Learning with Label Enhancement

Author: Geng Xin
Shao Ruifeng
Xu Ning
Publication date: 16/04/2019

The task of multi-label learning is to predict a set of relevant labels for an unseen instance. Traditional multi-label learning algorithms treat each class label as a logical indicator of whether the corresponding label is relevant or irrelevant to the instance, i.e., +1 means relevant to the instance and -1 means irrelevant to the instance. A label represented by -1 or +1 is called a logical label. Logical labels cannot reflect differences in label importance. However, for real-world multi-label learning problems, the importance of each possible label is generally different, and in real applications it is difficult to obtain the label importance information directly. Thus we need a method to reconstruct the essential label importance from the logical multi-label data. To solve this problem, we assume that each multi-label instance is described by a vector of latent real-valued labels, which can reflect the importance of the corresponding labels. Such a label is called a numerical label. The process of reconstructing the numerical labels from the logical multi-label data by utilizing the logical label information and the topological structure in the feature space is called Label Enhancement. In this paper, we propose a novel multi-label learning framework called LEMLL, i.e., Label Enhanced Multi-Label Learning, which incorporates regression of the numerical labels and label enhancement into a unified framework. Extensive comparative studies validate that the performance of multi-label learning can be improved significantly with label enhancement and that LEMLL can effectively reconstruct latent label importance information from logical multi-label data. Comment: ICDM 201

arXiv.org e-Print Archive

Crossref

Evolutionary dynamics of group cooperation with asymmetrical environmental feedback

Author: Fu Feng
Shao Yanxuan
Wang Xin
Publication venue: 'IOP Publishing'
Publication date: 13/06/2019

In recent years, there has been growing interest in studying evolutionary games with environmental feedback. Previous studies focus exclusively on two-player games. However, an extension to multi-player games is needed to study problems such as microbial cooperation and crowdsourcing collaborations. Here, we study coevolutionary public goods games where strategies coevolve with the multiplication factors of group cooperation. Asymmetry can arise in such environmental feedback, where games organized by focal cooperators may have a different efficiency than the ones organized by defectors. Our analysis shows that coevolutionary dynamics with asymmetrical environmental feedback can yield oscillatory convergence to persistent cooperation if the relative changing speed of cooperators' multiplication factor is above a certain threshold. Our work provides useful insights into sustaining group cooperation in a changing world.

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Freeze-in Dirac neutrinogenesis: thermal leptonic CP asymmetry

Author: Li Shao-Ping
Li Xin-Qiang
Yan Xin-Shuai
Yang Ya-Dong
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/12/2020

We present a freeze-in realization of Dirac neutrinogenesis in which the decaying particle that generates the lepton-number asymmetry is in thermal equilibrium. As the right-handed Dirac neutrinos are produced non-thermally, the lepton-number asymmetry is accumulated and partially converted to the baryon-number asymmetry via the rapid sphaleron transitions. The necessary CP-violating condition can be fulfilled by a purely thermal kinetic phase from the wavefunction correction in the lepton-doublet sector, which has been neglected in most leptogenesis-based setups. Furthermore, this condition necessitates a preferred flavor basis in which both the charged-lepton and neutrino Yukawa matrices are non-diagonal. To protect such a proper Yukawa structure from the basis transformations in flavor space prior to the electroweak gauge symmetry breaking, we can resort to a plethora of model buildings aimed at deciphering the non-trivial Yukawa structures. Interestingly, based on the well-known tri-bimaximal mixing with a minimal correction from the charged-lepton or neutrino sector, we find that a simultaneous explanation of the baryon-number asymmetry in the Universe and the low-energy neutrino oscillation observables can be attributed to the mixing angle and the CP-violating phase introduced in the minimal correction. Comment: 28 pages and 7 figures; more discussions and one figure added, final version published in the journa

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Family, school and jobs: intergenerational social mobility in Next Steps

Author: Shao Xin
Publication date: 01/01/2022

Young people’s higher education (HE) participation, and early access to labour markets, in the UK and other developed countries, are stratified according to their socio-economic origins and prior educational attainment. Such background factors are difficult to change in an individual’s lifetime, they are presumably not the only determinants of stratified outcomes, and anyway they could be mediated by peer influence and the issue of who goes to school with whom. This new study examines the relationships between a wide range of such social and economic factors relating to birth characteristics, family background, secondary schooling characteristics, and post-16 destinations, and it explores the possible reasons behind their links to HE and labour market outcomes. At the core of the study is an innovative combination of the large-scale nationally representative longitudinal Next Steps survey dataset linked to the robust administrative National Pupil Database (NPD) for England. In order to investigate the degree of social justice and equity in education, the study tracks the life course of a cohort of 5,192 state-school-educated young people in England from age 13 to age 25, to build a comprehensive picture of the journeys of these young people entering the labour market in their early adulthood. Analytical methods used include cross-tabulations, effect sizes, correlations and regression models. The main outcomes of interest are HE participation, and labour market outcomes as indicated by employment status and professional occupation status. The findings show a complex but relatively clear picture, providing some confirmatory and some new evidence on the correlates of intergenerational social mobility in a large cohort of people who are currently in their early 30s. Disadvantaged young people are consistently under-represented in HE participation and the labour market, especially in professional occupations. 
Bivariate analyses show that HE opportunities and labour market outcomes are systematically unbalanced between different socio-economic groups of young people, suggesting that destinations are strongly stratified by social origins. All of the factors considered in this study are independently associated with post-16 outcomes when analysed separately. Regression models reveal that, once birth characteristics are controlled for, the most important predictor of HE entry is prior educational attainment. This is followed by parental and pupil aspirations, parental occupation and education, material ownership at home, positive schooling experiences, and geographical location. In terms of employment status, doing an apprenticeship is the most powerful predictor of being employed at age 25 (although this may be skewed by the small number of young people still in formal education at that age). This is followed by prior educational attainment, material ownership at home, and prior HE entry. The relationship between the predictors and having a professional occupation status is slightly different. Regression analysis demonstrates that the key predictors of having a professional job are prior educational attainment, HE participation, parental and pupil aspirations, and positive schooling experiences. However, unlike generic employment status, evidence shows that having done an apprenticeship does not contribute to higher chances of landing a professional job. These findings collectively offer a core message in terms of fair access to life opportunities: the most important barriers to access to HE and professional occupations are stratified prior educational attainment and poverty-related factors at home. More crucially, the study also makes the first attempt to explore the level of segregation by background characteristics that is experienced at school as a potential factor in intergenerational social mobility.
It is, to our knowledge, the only study to date which examines whether and to what extent who goes to school with whom might play a role in these outcomes beyond school. Bivariate analyses show that the clustering of pupils of similarly poorer socio-economic backgrounds at school is consistently linked to lower chances of HE participation and poorer labour market outcomes. Regression analyses further suggest that the level of between-school segregation an individual experiences plays a small role in all post-16 pathways, over and above that which can be explained by individual factors. In the light of these results, it appears that life destinations are still patterned by background inequality in modern England. However, there are promising signs that policy interventions – including creating a more socially mixed school intake, providing more financial support for low-income families such as travel bursaries, continuing and improving contextualised assessment in both university admissions and recruitment processes, and investing more in public transport in deprived areas – can help to improve fair access to HE and the labour market. These interventions can bring other long-term benefits such as life satisfaction too. Perhaps, instead of advocating or focusing on promoting social mobility, policymakers should devote more energy to and invest more money in tackling social inequality and improving equity in education and life opportunities. If this were to be done effectively, then social mobility could, presumably, look after itself

Durham e-Theses

Are Discoveries Spurious? Distributions of Maximum Spurious Correlations and Their Applications

Author: Fan Jianqing
Shao Qi-Man
Zhou Wen-Xin
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 21/07/2017

Over the last two decades, many exciting variable selection methods have been developed for finding a small group of covariates that are associated with the response from a large pool. Can the discoveries from these data mining approaches be spurious due to high dimensionality and limited sample size? Can our fundamental assumptions about the exogeneity of the covariates needed for such variable selection be validated with the data? To answer these questions, we need to derive the distributions of the maximum spurious correlations given a certain number of predictors, namely, the distribution of the correlation of a response variable $Y$ with the best $s$ linear combinations of $p$ covariates $\mathbf{X}$, even when $\mathbf{X}$ and $Y$ are independent. When the covariance matrix of $\mathbf{X}$ possesses the restricted eigenvalue property, we derive such distributions for both a finite $s$ and a diverging $s$, using Gaussian approximation and empirical process techniques. However, such a distribution depends on the unknown covariance matrix of $\mathbf{X}$. Hence, we use the multiplier bootstrap procedure to approximate the unknown distributions and establish the consistency of such a simple bootstrap approach. The results are further extended to the situation where the residuals are from regularized fits. Our approach is then used to construct the upper confidence limit for the maximum spurious correlation and to test the exogeneity of the covariates. The former provides a baseline for guarding against false discoveries and the latter tests whether our fundamental assumptions for high-dimensional model selection are statistically valid. Our techniques and results are illustrated with both numerical examples and real data analysis.
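The notion of a maximum spurious correlation in the abstract above can be illustrated with a small simulation. The sketch below is not the paper's multiplier bootstrap procedure: it is a plain Monte Carlo draw for the simplest case $s = 1$, assuming independent standard normal covariates (identity covariance) and a response $Y$ generated independently of them. The function names `pearson` and `max_spurious_correlation`, the sample size and the seed are all hypothetical choices made for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def max_spurious_correlation(n, p, rng):
    """Largest |corr(X_j, Y)| over p covariates drawn independently of Y.

    Assumption: s = 1 and i.i.d. N(0, 1) data, so any nonzero
    correlation found here is spurious by construction.
    """
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    best = 0.0
    for _ in range(p):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        best = max(best, abs(pearson(x, y)))
    return best


rng = random.Random(0)  # fixed seed so the run is reproducible
few = max_spurious_correlation(n=50, p=5, rng=rng)
many = max_spurious_correlation(n=50, p=500, rng=rng)
print(f"max |corr| with p=5:   {few:.3f}")
print(f"max |corr| with p=500: {many:.3f}")
```

With the sample size held fixed, the maximum over a larger pool of independent covariates tends to be larger, which is the effect the upper confidence limit described in the abstract is meant to guard against.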

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

eScholarship - University of California

Cramér-type moderate deviations for Studentized two-sample $U$-statistics with applications

Author: Chang Jinyuan
Shao Qi-Man
Zhou Wen-Xin
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 28/09/2016

Two-sample $U$-statistics are widely used in a broad range of applications, including those in the fields of biostatistics and econometrics. In this paper, we establish sharp Cramér-type moderate deviation theorems for Studentized two-sample $U$-statistics in a general framework, including the two-sample $t$-statistic and Studentized Mann-Whitney test statistic as prototypical examples. In particular, a refined moderate deviation theorem with second-order accuracy is established for the two-sample $t$-statistic. These results extend the applicability of the existing statistical methodologies from the one-sample $t$-statistic to more general nonlinear statistics. Applications to two-sample large-scale multiple testing problems with false discovery rate control and the regularized bootstrap method are also discussed. Comment: Published at http://dx.doi.org/10.1214/15-AOS1375 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

eScholarship - University of California