Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering
Comparing large covariance matrices has important applications in modern
genomics, where scientists are often interested in understanding whether
relationships (e.g., dependencies or co-regulations) among a large number of
genes vary between different biological states. We propose a computationally
fast procedure for testing the equality of two large covariance matrices when
the dimensions of the covariance matrices are much larger than the sample
sizes. A distinguishing feature of the new procedure is that it imposes no
structural assumptions on the unknown covariance matrices. Hence the test is
robust with respect to various complex dependence structures that frequently
arise in genomics. We prove that the proposed procedure is asymptotically valid
under weak moment conditions. As an interesting application, we derive a new
gene clustering algorithm which shares the same nice property of avoiding
restrictive structural assumptions for high-dimensional genomics data. Using an
asthma gene expression dataset, we illustrate how the new test helps compare
the covariance matrices of the genes across different gene sets/pathways
between the disease group and the control group, and how the gene clustering
algorithm provides new insights on the way gene clustering patterns differ
between the two groups. The proposed methods have been implemented in the
R package HDtest, which is available on CRAN.
Comment: The original title dated back to May 2015 is "Bootstrap Tests on High
Dimensional Covariance Matrices with Applications to Understanding Gene
Clustering"
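The abstract above describes a test for equality of two covariance matrices that avoids structural assumptions. A minimal Python sketch of one way such a test could look is given here: a Studentized maximum-type statistic over all entries, calibrated by a Gaussian multiplier bootstrap. The statistic, the calibration scheme, and the names `max_cov_test` and `_centered_products` are illustrative assumptions, not the authors' procedure and not the HDtest API.

```python
import numpy as np

def _centered_products(x):
    """Return the sample covariance (1/n normalisation) and the per-observation
    outer products of the centred data, recentred at that covariance."""
    xc = x - x.mean(axis=0)
    prod = xc[:, :, None] * xc[:, None, :]      # shape (n, p, p)
    s = prod.mean(axis=0)
    return s, prod - s

def max_cov_test(x, y, n_boot=500, seed=0):
    """Hypothetical max-type test of H0: Cov(x) = Cov(y)."""
    rng = np.random.default_rng(seed)
    n, m = len(x), len(y)
    s1, d1 = _centered_products(x)
    s2, d2 = _centered_products(y)
    # Entrywise standard error of s1 - s2
    scale = np.sqrt((d1 ** 2).mean(axis=0) / n + (d2 ** 2).mean(axis=0) / m)
    stat = np.max(np.abs(s1 - s2) / scale)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Gaussian multipliers perturb each observation's contribution
        g1 = rng.standard_normal(n)
        g2 = rng.standard_normal(m)
        pert = np.tensordot(g1, d1, axes=1) / n - np.tensordot(g2, d2, axes=1) / m
        boot[b] = np.max(np.abs(pert) / scale)
    pval = (1 + np.sum(boot >= stat)) / (1 + n_boot)
    return stat, pval
```

The sketch stores an n × p × p array, so it is only practical for small p; a high-dimensional implementation would have to avoid materialising that array.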
Simulation-Based Hypothesis Testing of High Dimensional Means Under Covariance Heterogeneity
In this paper, we study the problem of testing the mean vectors of high
dimensional data in both one-sample and two-sample cases. The proposed testing
procedures employ maximum-type statistics and the parametric bootstrap
techniques to compute the critical values. Different from the existing tests
that heavily rely on the structural conditions on the unknown covariance
matrices, the proposed tests allow general covariance structures of the data
and therefore enjoy wide scope of applicability in practice. To enhance powers
of the tests against sparse alternatives, we further propose two-step
procedures with a preliminary feature screening step. Theoretical properties of
the proposed tests are investigated. Through extensive numerical experiments on
synthetic datasets and a human acute lymphoblastic leukemia gene expression
dataset, we illustrate the performance of the new tests and how they may
provide assistance on detecting disease-associated gene-sets. The proposed
methods have been implemented in an R-package HDtest and are available on CRAN.
Comment: 34 pages, 10 figures; Accepted for biometric
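The two-sample case described in this abstract (a maximum-type statistic with critical values from a parametric bootstrap, under unequal covariances) can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `max_mean_test`, the coordinate-wise Studentization, and the p-value formula are choices made here, and the screening step of the two-step procedure is omitted. It does not reproduce the HDtest implementation.

```python
import numpy as np

def max_mean_test(x, y, n_boot=1000, seed=0):
    """Hypothetical max-type test of H0: E[x] = E[y], covariances may differ."""
    rng = np.random.default_rng(seed)
    n, p = x.shape
    m = y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Estimated covariance of the mean difference; no equal-covariance assumption
    cov = np.cov(x, rowvar=False) / n + np.cov(y, rowvar=False) / m
    sd = np.sqrt(np.diag(cov))
    stat = np.max(np.abs(diff) / sd)
    # Parametric bootstrap: Gaussian draws with the estimated covariance.
    # check_valid="ignore" because the estimate is singular when p exceeds n + m.
    z = rng.multivariate_normal(np.zeros(p), cov, size=n_boot,
                                check_valid="ignore", method="svd")
    boot = np.max(np.abs(z) / sd, axis=1)
    pval = (1 + np.sum(boot >= stat)) / (1 + n_boot)
    return stat, pval
```

Because the bootstrap draws share the estimated correlation structure across coordinates, the critical value adapts to dependence without a Bonferroni-style correction.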
Cramér-type moderate deviations for Studentized two-sample U-statistics with applications
Two-sample U-statistics are widely used in a broad range of applications,
including those in the fields of biostatistics and econometrics. In this paper,
we establish sharp Cramér-type moderate deviation theorems for Studentized
two-sample U-statistics in a general framework, including the two-sample
t-statistic and Studentized Mann-Whitney test statistic as prototypical
examples. In particular, a refined moderate deviation theorem with second-order
accuracy is established for the two-sample t-statistic. These results extend
the applicability of the existing statistical methodologies from the one-sample
t-statistic to more general nonlinear statistics. Applications to two-sample
large-scale multiple testing problems with false discovery rate control and the
regularized bootstrap method are also discussed.
Comment: Published at http://dx.doi.org/10.1214/15-AOS1375 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Web3D learning framework for 3D shape retrieval based on hybrid convolutional neural networks
With the rapid development of Web3D technologies, sketch-based model retrieval has become an increasingly important challenge, while the application of Virtual Reality and 3D technologies has made shape retrieval of furniture over a web browser feasible. In this paper, we propose a learning framework for shape retrieval based on two Siamese VGG-16 Convolutional Neural Networks (CNNs), and a CNN-based hybrid learning algorithm to select the best view for a shape. In this algorithm, the AlexNet and VGG-16 CNN architectures are used to perform classification tasks and to extract features, respectively. In addition, a feature fusion method is used to measure the similarity relation of the output features from the two Siamese networks. The proposed framework can provide new alternatives for furniture retrieval in the Web3D environment. The primary innovation is in the employment of deep learning methods to solve the challenge of obtaining the best view of 3D furniture, and to address cross-domain feature learning problems. We conduct an experiment to verify the feasibility of the framework and the results show our approach to be superior to many mainstream state-of-the-art approaches.
Infrared and visible image fusion based on residual dense network and gradient loss
Deep learning has made great progress in the field of image fusion. Compared with traditional methods, the image fusion approach based on deep learning requires no cumbersome matrix operations. In this paper, an end-to-end model for infrared and visible image fusion is proposed. This unsupervised learning network architecture does not employ a fusion strategy. In the stage of feature extraction, residual dense blocks are used to generate a fusion image, which preserves the information of source images to the greatest extent. In the model of feature reconstruction, shallow feature maps, residual dense information, and deep feature maps are merged in order to build a fused result. The gradient loss that we propose for the network can cooperate well with special weight blocks extracted from input images to more clearly express texture details in fused images. In the training phase, we select 20 source image pairs with obvious characteristics from the TNO dataset, and expand them by random tailoring to serve as the training dataset of the network. Subjective qualitative and objective quantitative results show that the proposed model has advantages over state-of-the-art methods in the tasks of infrared and visible image fusion. We also use the RoadScene dataset to do ablation experiments to verify the effectiveness of the proposed network for infrared and visible image fusion.
GeSeNet: A General Semantic-guided Network with Couple Mask Ensemble for Medical Image Fusion
Testing for high-dimensional white noise using maximum cross-correlations
We propose a new omnibus test for vector white noise using the maximum absolute autocorrelations and cross-correlations of the component series. Based on an approximation by the L∞-norm of a normal random vector, the critical value of the test can be evaluated by bootstrapping from a multivariate normal distribution. In contrast to the conventional white noise test, the new method is proved to be valid for testing departure from white noise that is not independent and identically distributed. We illustrate the accuracy and the power of the proposed test by simulation, which also shows that the new test outperforms several commonly used methods, including the Lagrange multiplier test and the multivariate Box–Pierce portmanteau tests, especially when the dimension of the time series is high in relation to the sample size. The numerical results also indicate that the performance of the new test can be further enhanced when it is applied to pre-transformed data obtained via the time series principal component analysis proposed by J. Chang, B. Guo and Q. Yao (arXiv:1410.2323). The proposed procedures have been implemented in an R package
- …
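The white noise abstract above builds its test on the maximum absolute autocorrelation and cross-correlation of the component series. A minimal sketch of that statistic alone is shown here. The function name `max_cross_correlation`, the √n scaling, and the choice of lags 1 to `max_lag` are assumptions made for illustration, and the bootstrap calibration from a multivariate normal distribution described in the abstract is not implemented.

```python
import numpy as np

def max_cross_correlation(x, max_lag):
    """sqrt(n) times the largest absolute sample auto- or cross-correlation
    over lags 1..max_lag, for an (n, p) array of p component series."""
    n, p = x.shape
    xc = x - x.mean(axis=0)
    z = xc / np.sqrt((xc ** 2).mean(axis=0))    # standardise each component
    largest = 0.0
    for k in range(1, max_lag + 1):
        # rho[i, j] estimates corr(x[t + k, i], x[t, j])
        rho = z[k:].T @ z[:-k] / n
        largest = max(largest, float(np.abs(rho).max()))
    return np.sqrt(n) * largest
```

Large values indicate serial dependence at some lag in some pair of components; a usable test still needs a critical value, which this sketch leaves out.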