
    Asymptotic properties of eigenmatrices of a large sample covariance matrix

    Let $S_n=\frac{1}{n}X_nX_n^*$, where $X_n=\{X_{ij}\}$ is a $p\times n$ matrix with i.i.d. complex standardized entries having finite fourth moments. Let $Y_n(\mathbf{t}_1,\mathbf{t}_2,\sigma)=\sqrt{p}\,\bigl(\mathbf{x}_n(\mathbf{t}_1)^*(S_n+\sigma I)^{-1}\mathbf{x}_n(\mathbf{t}_2)-\mathbf{x}_n(\mathbf{t}_1)^*\mathbf{x}_n(\mathbf{t}_2)\,m_n(\sigma)\bigr)$, in which $\sigma>0$ and $m_n(\sigma)=\int\frac{dF_{y_n}(x)}{x+\sigma}$, where $F_{y_n}(x)$ is the Marčenko–Pastur law with parameter $y_n=p/n$, which converges to a positive constant as $n\to\infty$, and $\mathbf{x}_n(\mathbf{t}_1)$ and $\mathbf{x}_n(\mathbf{t}_2)$ are unit vectors in $\mathbb{C}^p$, having indices $\mathbf{t}_1$ and $\mathbf{t}_2$, ranging in a compact subset of a finite-dimensional Euclidean space. In this paper, we prove that the sequence $Y_n(\mathbf{t}_1,\mathbf{t}_2,\sigma)$ converges weakly to a $(2m+1)$-dimensional Gaussian process. This result provides further evidence in support of the conjecture that the distribution of the eigenmatrix of $S_n$ is asymptotically close to that of a Haar-distributed unitary matrix. Comment: Published at http://dx.doi.org/10.1214/10-AAP748 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org
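The centering term $m_n(\sigma)$ in the abstract above is the integral of $1/(x+\sigma)$ against the Marčenko–Pastur law. The following is a minimal sketch, not the authors' code, that evaluates it in two ways: through a closed form obtained from the standard Marčenko–Pastur equation, and through direct quadrature of the density. It assumes unit-variance entries, and the quadrature assumes $0<y<1$ so that the law has no atom at zero. The function names `mp_m_closed_form` and `mp_m_quadrature` are hypothetical.

```python
import math


def mp_m_closed_form(sigma, y):
    """m(sigma) = integral of dF_y(x) / (x + sigma) for the Marchenko-Pastur law.

    Assumes unit-variance entries, ratio y = p/n > 0 and sigma > 0.
    Obtained by solving the quadratic Marchenko-Pastur equation at z = -sigma
    and keeping the positive root.
    """
    a = 1.0 - y + sigma
    return (math.sqrt(a * a + 4.0 * y * sigma) - a) / (2.0 * y * sigma)


def mp_m_quadrature(sigma, y, steps=4000):
    """Same integral by the midpoint rule on the density (valid for 0 < y < 1).

    The substitution x = c + r*cos(theta) removes the square-root
    singularities at the edges of the support.
    """
    c = 1.0 + y                  # midpoint of the support
    r = 2.0 * math.sqrt(y)       # half-width of the support
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        x = c + r * math.cos(theta)
        # density(x) * |dx| = (r * sin(theta))**2 / (2 * pi * y * x) * dtheta
        weight = (r * math.sin(theta)) ** 2 / (2.0 * math.pi * y * x)
        total += weight / (x + sigma)
    return total * h
```

Agreement between the two functions for a given pair, for example `sigma = 1.0` and `y = 0.5`, is a quick sanity check on the closed form before it is used to center a simulated quadratic form.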

    Learning associations between clinical information and motion-based descriptors using a large scale MR-derived cardiac motion atlas

    The availability of large-scale databases containing imaging and non-imaging data, such as the UK Biobank, represents an opportunity to improve our understanding of healthy and diseased bodily function. Cardiac motion atlases provide a space of reference in which the motion fields of a cohort of subjects can be directly compared. In this work, a cardiac motion atlas is built from cine MR data from the UK Biobank (~6000 subjects). Two automated quality control strategies are proposed to reject subjects with insufficient image quality. Based on the atlas, three dimensionality reduction algorithms are evaluated to learn data-driven cardiac motion descriptors, and statistical methods are used to study the association between these descriptors and non-imaging data. Results show a positive correlation between the atlas motion descriptors and body fat percentage, basal metabolic rate, hypertension, smoking status and alcohol intake frequency. The proposed method outperforms ejection fraction, the most commonly used descriptor of cardiac function, in identifying changes in cardiac function due to these known cardiovascular risk factors. In conclusion, this work represents a framework for further investigation of the factors influencing cardiac health. Comment: 2018 International Workshop on Statistical Atlases and Computational Modeling of the Hear

    Scanning tunneling microscopy investigation of 2H-MoS_2: A layered semiconducting transition‐metal dichalcogenide

    Scanning tunneling microscopy (STM) has been enormously successful in solving several important problems in the geometric and electronic structure of homogeneous metallic and semiconducting surfaces. A central question which remains to be answered with respect to the study of compound surfaces, however, is the extent to which the chemical identity of constituent atoms may be established. Recently, progress in this area was made by Feenstra et al., who succeeded in selectively imaging either Ga or As atoms on the GaAs (110) surface. So far this is the only case where such selectivity has been achieved. In an effort to add to our understanding of compound surface imaging, we have undertaken a vacuum STM study of 2H-MoS_2, a material which has two structurally and electronically different atomic species at its surface.

    Testing linear hypotheses in high-dimensional regressions

    For a multivariate linear model, Wilks' likelihood ratio test (LRT) constitutes one of the cornerstone tools. However, the computation of its quantiles under the null or the alternative requires complex analytic approximations and, more importantly, these distributional approximations are feasible only for moderate dimension of the dependent variable, say $p\le 20$. On the other hand, assuming that the data dimension $p$ as well as the number $q$ of regression variables are fixed while the sample size $n$ grows, several asymptotic approximations are proposed in the literature for Wilks' $\Lambda$, including the widely used chi-square approximation. In this paper, we consider necessary modifications to Wilks' test in a high-dimensional context, specifically assuming a high data dimension $p$ and a large sample size $n$. Based on recent random matrix theory, the correction we propose to Wilks' test is asymptotically Gaussian under the null, and simulations demonstrate that the corrected LRT has very satisfactory size and power, certainly in the large $p$ and large $n$ context, but also for moderately large data dimensions like $p=30$ or $p=50$. As a byproduct, we give a reason explaining why the standard chi-square approximation fails for high-dimensional data. We also introduce a new procedure for the classical multiple sample significance test in MANOVA which is valid for high-dimensional data. Comment: Accepted 02/2012 for publication in "Statistics". 20 pages, 2 pages and 2 table
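For orientation, the classical statistic that the abstract above sets out to correct can be sketched as follows. This is a minimal illustration, not the corrected test proposed in the paper: it computes Wilks' $\Lambda$ as the ratio of determinants of the residual sum-of-squares matrices under the full and reduced models, together with the textbook Bartlett-scaled statistic and its chi-square degrees of freedom. The function names, the convention that the tested regressors occupy the first `q1` columns of `X`, and the pure-Python linear algebra helpers are all assumptions made for illustration.

```python
import math


def matmul_t(A, B):
    """Return A^T B for row-major lists of lists with equal row counts."""
    rows = range(len(A))
    return [[sum(A[k][i] * B[k][j] for k in rows) for j in range(len(B[0]))]
            for i in range(len(A[0]))]


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    m = len(A)
    M = [A[i][:] + B[i][:] for i in range(m)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(m):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [[M[i][m + j] / M[i][i] for j in range(len(B[0]))] for i in range(m)]


def logdet_spd(A):
    """log det of a symmetric positive definite matrix by elimination."""
    M = [row[:] for row in A]
    total = 0.0
    for c in range(len(M)):
        total += math.log(M[c][c])
        for r in range(c + 1, len(M)):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return total


def resid_sscp(Y, X):
    """Residual matrix Y^T Y - Y^T X (X^T X)^{-1} X^T Y of a least-squares fit."""
    p = len(Y[0])
    YtY = matmul_t(Y, Y)
    if not X[0]:                      # no regressors left in the reduced model
        return YtY
    XtY = matmul_t(X, Y)
    fit = matmul_t(XtY, solve(matmul_t(X, X), XtY))
    return [[YtY[i][j] - fit[i][j] for j in range(p)] for i in range(p)]


def wilks_lambda(Y, X, q1):
    """Test that the coefficients of the first q1 columns of X are zero.

    Y is n x p, X is n x q. Returns (Lambda, Bartlett statistic, chi-square df).
    """
    n, p, q = len(Y), len(Y[0]), len(X[0])
    E = resid_sscp(Y, X)                              # full model
    E0 = resid_sscp(Y, [row[q1:] for row in X])       # reduced model, equals E + H
    log_lam = logdet_spd(E) - logdet_spd(E0)
    stat = -((n - q) - (p - q1 + 1) / 2.0) * log_lam  # Bartlett scaling
    return math.exp(log_lam), stat, p * q1
```

Under the fixed-dimension asymptotics described in the abstract, the returned statistic would be compared with a chi-square quantile on `p * q1` degrees of freedom; the abstract's point is that this comparison becomes unreliable once $p$ is large relative to $n$.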

    Stratified decision forests for accurate anatomical landmark localization in cardiac images

    Accurate localization of anatomical landmarks is an important step in medical imaging, as it provides useful prior information for subsequent image analysis and acquisition methods. It is particularly useful for initialization of automatic image analysis tools (e.g. segmentation and registration) and detection of scan planes for automated image acquisition. Landmark localization has been commonly performed using learning-based approaches, such as classifier and/or regressor models. However, trained models may not generalize well in heterogeneous datasets when the images contain large differences due to size, pose and shape variations of organs. To learn more data-adaptive and patient-specific models, we propose a novel stratification-based training model, and demonstrate its use in a decision forest. The proposed approach does not require any additional training information compared to the standard model training procedure and can be easily integrated into any decision tree framework. The proposed method is evaluated on 1080 3D high-resolution and 90 multi-stack 2D cardiac cine MR images. The experiments show that the proposed method achieves state-of-the-art landmark localization accuracy and outperforms standard regression- and classification-based approaches. Additionally, the proposed method is used in a multi-atlas segmentation to create a fully automatic segmentation pipeline, and the results show that it achieves state-of-the-art segmentation accuracy.