Gene ranking and biomarker discovery under correlation
Biomarker discovery and gene ranking is a standard task in genomic high
throughput analysis. Typically, the ordering of markers is based on a
stabilized variant of the t-score, such as the moderated t or the SAM
statistic. However, these procedures ignore gene-gene correlations, which may
have a profound impact on the gene orderings and on the power of the subsequent
tests.
We propose a simple procedure that adjusts gene-wise t-statistics to take
account of correlations among genes. The resulting correlation-adjusted
t-scores ("cat" scores) are derived from a predictive perspective, i.e. as a
score for variable selection to discriminate group membership in two-class
linear discriminant analysis. In the absence of correlation the cat score
reduces to the standard t-score. Moreover, using the cat score it is
straightforward to evaluate groups of features (i.e. gene sets). For
computation of the cat score from small sample data we propose a shrinkage
procedure. In a comparative study comprising six different synthetic and
empirical correlation structures we show that the cat score improves estimation
of gene orderings and leads to higher power for fixed true discovery rate, and
vice versa. Finally, we also illustrate the cat score by analyzing metabolomic
data.
The shrinkage cat score is implemented in the R package "st" available from
URL http://cran.r-project.org/web/packages/st/
Comment: 18 pages, 5 figures, 1 table
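The decorrelation idea in this abstract can be sketched in a few lines. The abstract gives no formula, so this sketch assumes the cat score is the inverse matrix square root of the gene-gene correlation matrix applied to the vector of t-scores, which is consistent with the stated property that it reduces to the t-score when there is no correlation. The function name `cat_scores` and the toy numbers are hypothetical, the shrinkage estimation step is omitted, and the "st" package remains the reference implementation.

```python
import numpy as np

def cat_scores(t, R):
    """Assumed form of the cat score: R^(-1/2) @ t, with R a correlation matrix."""
    t = np.asarray(t, dtype=float)
    R = np.asarray(R, dtype=float)
    # Symmetric inverse square root via eigendecomposition.
    # R is assumed positive definite; small-sample data would need shrinkage first.
    eigvals, eigvecs = np.linalg.eigh(R)
    R_inv_sqrt = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return R_inv_sqrt @ t

# Toy t-scores for three genes (illustrative values only).
t = np.array([2.0, -1.0, 0.5])

# Without correlation the scores are left unchanged.
uncorrelated = cat_scores(t, np.eye(3))

# Genes 1 and 2 are correlated; gene 3 is independent of both.
R = np.array([[1.0, 0.6, 0.0],
              [0.6, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
correlated = cat_scores(t, R)
```

Under this assumed definition, the sum of squared cat scores over a set of genes equals the quadratic form t' R^(-1) t for that set, which is one way to read the abstract's remark that groups of features are straightforward to evaluate.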
On Two Simple and Effective Procedures for High Dimensional Classification of General Populations
In this paper, we generalize two criteria, the determinant-based and
trace-based criteria proposed by Saranadasa (1993), to general populations for
high dimensional classification. These two criteria compare some distances
between a new observation and several different known groups. The
determinant-based criterion performs well for correlated variables by
integrating the covariance structure and is competitive with many other existing
rules. The criterion however requires the measurement dimension be smaller than
the sample size. The trace-based criterion in contrast, is an independence rule
and effective in the "large dimension-small sample size" scenario. An appealing
property of these two criteria is that their implementation is straightforward
and there is no need for preliminary variable selection or use of tuning
parameters. Their asymptotic misclassification probabilities are derived using
the theory of large dimensional random matrices. Their competitive performances
are illustrated by intensive Monte Carlo experiments and a real data analysis.
Comment: 5 figures; 22 pages. To appear in "Statistical Papers"
Localized Regression
The main problem with localized discriminant techniques is the curse of dimensionality, which seems to restrict their use to the case of few variables. This restriction does not hold if localization is combined with a reduction of dimension. In particular, it is shown that localization yields powerful classifiers even in higher dimensions if localization is combined with locally adaptive selection of predictors. A robust localized logistic regression (LLR) method is developed for which all tuning parameters are chosen data-adaptively. In an extended simulation study we evaluate the potential of the proposed procedure for various types of data and compare it to other classification procedures. In addition, we demonstrate that automatic choice of localization, predictor selection and penalty parameters based on cross-validation works well. Finally, the method is applied to real data sets and its real-world performance is compared to alternative procedures.
Alliance block composition patterns in the microelectronics industry
In this note we examine whether a position in a technology alliance block is accessible to everyone. It appears that partners are selected on the basis of their distinctive attributes, which can prevent outsiders from joining these alliance groups. Our findings clearly indicate that alliance blocks are composed of actors that have rather similar characteristics. The social selection processes that alliance block members employ vis-a-vis non-block members can create a source of competitive advantage in terms of higher innovative performance. Empirical research is focused on the international microelectronics industry.
Keywords: strategic technology alliances, alliance block membership strategy, microelectronics industry, group-based competition
Maximized Posteriori Attributes Selection from Facial Salient Landmarks for Face Recognition
This paper presents a robust and dynamic face recognition technique based on
the extraction and matching of devised probabilistic graphs drawn on SIFT
features related to independent face areas. The face matching strategy is based
on matching individual salient facial graphs characterized by SIFT features as
connected to facial landmarks such as the eyes and the mouth. In order to
reduce the face matching errors, the Dempster-Shafer decision theory is applied
to fuse the individual matching scores obtained from each pair of salient
facial features. The proposed algorithm is evaluated with the ORL and the IITK
face databases. The experimental results demonstrate the effectiveness and
potential of the proposed face recognition technique, even in the case of partially
occluded faces.
Comment: 8 pages, 2 figures
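The score fusion step mentioned in this abstract can be illustrated with Dempster's rule of combination. This is only a minimal sketch: the abstract does not say how matching scores are turned into belief masses, so the two-hypothesis frame (`genuine` and `impostor`), the mass values, and the function name `dempster_combine` are all assumptions made for illustration.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of the two focal sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}

GENUINE = frozenset({"genuine"})
IMPOSTOR = frozenset({"impostor"})
EITHER = GENUINE | IMPOSTOR  # mass left on the whole frame expresses ignorance

# Hypothetical masses derived from two facial regions (values are made up).
eyes = {GENUINE: 0.6, IMPOSTOR: 0.1, EITHER: 0.3}
mouth = {GENUINE: 0.5, IMPOSTOR: 0.2, EITHER: 0.3}

fused = dempster_combine(eyes, mouth)
```

When both regions lean toward the same hypothesis, as in this toy input, the fused mass on that hypothesis ends up higher than either source assigned on its own.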
Detecting single-trial EEG evoked potential using a wavelet domain linear mixed model: application to error potentials classification
Objective. The main goal of this work is to develop a model for multi-sensor
signals such as MEG or EEG signals, that accounts for the inter-trial
variability, suitable for corresponding binary classification problems. An
important constraint is that the model be simple enough to handle small size
and unbalanced datasets, as often encountered in BCI type experiments.
Approach. The method involves a linear mixed-effects statistical model, a wavelet
transform and spatial filtering, and aims at the characterization of localized
discriminant features in multi-sensor signals. After discrete wavelet transform
and spatial filtering, a projection onto the relevant wavelet and spatial
channels subspaces is used for dimension reduction. The projected signals are
then decomposed as the sum of a signal of interest (i.e. discriminant) and
background noise, using a very simple Gaussian linear mixed model. Main
results. Thanks to the simplicity of the model, the corresponding parameter
estimation problem is simplified. Robust estimates of class-covariance matrices
are obtained from small sample sizes and an effective Bayes plug-in classifier
is derived. The approach is applied to the detection of error potentials in
multichannel EEG data, in a very unbalanced situation (detection of rare
events). Classification results prove the relevance of the proposed approach in
such a context. Significance. The combination of linear mixed model, wavelet
transform and spatial filtering for EEG classification is, to the best of our
knowledge, an original approach, which is proven to be effective. This paper
improves on earlier results on similar problems, and the three main ingredients
all play an important role.
Beyond Gauss: Image-Set Matching on the Riemannian Manifold of PDFs
State-of-the-art image-set matching techniques typically implicitly model
each image-set with a Gaussian distribution. Here, we propose to go beyond
these representations and model image-sets as probability distribution
functions (PDFs) using kernel density estimators. To compare and match
image-sets, we exploit Csiszar f-divergences, which bear strong connections to
the geodesic distance defined on the space of PDFs, i.e., the statistical
manifold. Furthermore, we introduce valid positive definite kernels on the
statistical manifolds, which let us make use of more powerful classification
schemes to match image-sets. Finally, we introduce a supervised dimensionality
reduction technique that learns a latent space where f-divergences reflect the
class labels of the data. Our experiments on diverse problems, such as
video-based face recognition and dynamic texture classification, evidence the
benefits of our approach over the state-of-the-art image-set matching methods.
Classification of chirp signals using hierarchical bayesian learning and MCMC methods
This paper addresses the problem of classifying chirp signals using hierarchical Bayesian learning together with Markov chain Monte Carlo (MCMC) methods. Bayesian learning consists of estimating the distribution of the observed data conditional on each class from a set of training samples. Unfortunately, this estimation requires evaluating intractable multidimensional integrals. This paper studies an original implementation of hierarchical Bayesian learning that estimates the class conditional probability densities using MCMC methods. The performance of this implementation is first studied via an academic example for which the class conditional densities are known. The problem of classifying chirp signals is then addressed by using a similar hierarchical Bayesian learning implementation based on a Metropolis-within-Gibbs algorithm.
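For readers unfamiliar with the sampler named in this abstract, the sketch below shows the generic Metropolis-within-Gibbs pattern: coordinates are visited in turn, and each is updated with a random-walk Metropolis step instead of an exact conditional draw. The target here is a toy bivariate Gaussian chosen purely for illustration. It is not the paper's hierarchical chirp model, and the names `log_target` and `metropolis_within_gibbs`, the step size, and the iteration count are assumptions.

```python
import numpy as np

def log_target(x, rho=0.5):
    """Unnormalized log density of a zero-mean, unit-variance bivariate Gaussian."""
    return -(x[0] ** 2 - 2 * rho * x[0] * x[1] + x[1] ** 2) / (2 * (1 - rho ** 2))

def metropolis_within_gibbs(n_iter=5000, step=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(2)
    samples = np.empty((n_iter, 2))
    for i in range(n_iter):
        # Sweep over coordinates, as in a Gibbs sampler.
        for j in range(2):
            proposal = x.copy()
            proposal[j] += step * rng.normal()  # symmetric random-walk proposal
            # Metropolis acceptance on the joint density; since only coordinate j
            # changes, the ratio equals the ratio of the full conditionals.
            if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
                x = proposal
        samples[i] = x
    return samples

samples = metropolis_within_gibbs()
```

The same loop structure applies when some coordinates do have tractable full conditionals: those are drawn exactly, and only the remaining ones use the Metropolis step.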
Financial crises and bank failures: a review of prediction methods
In this article we analyze financial and economic circumstances associated with the U.S. subprime mortgage crisis and the global financial turmoil that has led to severe crises in many countries. We suggest that the level of cross-border holdings of long-term securities between the United States and the rest of the world may indicate a direct link between the turmoil in the securitized market originated in the United States and that in other countries. We provide a summary of empirical results obtained in several Economics and Operations Research papers that attempt to explain, predict, or suggest remedies for financial crises or banking defaults; we also extensively outline the methodologies used in them. The intent of this article is to promote future empirical research for preventing financial crises.
Keywords: Subprime mortgage; Financial crises