An Approach to Variable Aggregation in Efficiency Analysis
In the nonparametric framework of Data Envelopment Analysis (DEA), the statistical properties of the
estimators have been investigated, but only asymptotic results are available. Results of
practical use have been proved only for the case of one input and one output. In real-world
problems, however, the production process is usually described by many variables. This paper presents a machine learning
approach to variable aggregation based on Canonical Correlation Analysis. The approach is applied
to estimate the efficiency of all the farms on Terceira Island in the Azorean archipelago.
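As a rough illustration of the aggregation idea (not the paper's actual procedure, which the abstract does not spell out), the first pair of canonical variates can collapse several inputs and several outputs into one aggregate input and one aggregate output. The data, the helper names `inv_sqrt` and `first_canonical_pair`, and the choice of a plain NumPy implementation are all assumptions made for this sketch.

```python
import numpy as np

def inv_sqrt(m):
    # Inverse symmetric square root of a positive-definite matrix.
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def first_canonical_pair(x, y):
    # Classical CCA: whiten both blocks, then take the leading singular
    # vectors of the whitened cross-covariance matrix.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    n = x.shape[0]
    sxx = xc.T @ xc / (n - 1)
    syy = yc.T @ yc / (n - 1)
    sxy = xc.T @ yc / (n - 1)
    wx, wy = inv_sqrt(sxx), inv_sqrt(syy)
    u, s, vt = np.linalg.svd(wx @ sxy @ wy)
    a = wx @ u[:, 0]   # weights for the input block
    b = wy @ vt[0, :]  # weights for the output block
    return a, b, s[0]  # s[0] is the first canonical correlation

# Synthetic data (hypothetical): 200 units, 3 inputs, 2 outputs.
rng = np.random.default_rng(0)
inputs = rng.uniform(1.0, 10.0, size=(200, 3))
mix = np.array([[0.5, 0.2], [0.3, 0.4], [0.1, 0.6]])
outputs = inputs @ mix + rng.normal(0.0, 0.5, size=(200, 2))

a, b, rho = first_canonical_pair(inputs, outputs)
agg_input = inputs @ a    # one aggregate input per unit
agg_output = outputs @ b  # one aggregate output per unit
```

The canonical weights are only defined up to sign and may be negative, so the aggregates would likely need rescaling or shifting before being passed to a one-input, one-output efficiency estimator.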
Adaptive kernel canonical correlation analysis algorithms for nonparametric identification of Wiener and Hammerstein systems
This paper treats the identification of nonlinear systems that consist of a cascade of a linear channel and a nonlinearity, such as the well-known Wiener and Hammerstein systems. In particular, we follow a supervised identification approach that simultaneously identifies both parts of the nonlinear system. Given the correct restrictions on the identification problem, we show how kernel canonical correlation analysis (KCCA) emerges as the logical solution to this problem. We then extend the proposed identification algorithm to an adaptive version that can deal with time-varying systems. To avoid overfitting, we discuss and compare three possible regularization techniques for both the batch and the adaptive versions of the proposed algorithm. Simulations are included to demonstrate the effectiveness of the presented algorithm.
Gaussian process single-index models as emulators for computer experiments
A single-index model (SIM) provides for parsimonious multi-dimensional
nonlinear regression by combining parametric (linear) projection with
univariate nonparametric (non-linear) regression models. We show that a
particular Gaussian process (GP) formulation is simple to work with and ideal
as an emulator for some types of computer experiment as it can outperform the
canonical separable GP regression model commonly used in this setting. Our
contribution focuses on drastically simplifying, re-interpreting, and then
generalizing a recently proposed fully Bayesian GP-SIM combination, and then
illustrating its favorable performance on synthetic data and a real-data
computer experiment. Two R packages, both released on CRAN, have been augmented
to facilitate inference under our proposed model(s).
Comment: 23 pages, 9 figures, 1 table
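The single-index structure described in this abstract, a linear projection followed by a univariate nonparametric fit, can be sketched in a few lines. This is not the Gaussian process formulation the authors propose: the sketch swaps in a Nadaraya-Watson kernel smoother for the univariate step, and it assumes the index direction `beta` is already known, whereas a real SIM would estimate it. The function name `nw_smooth`, the bandwidth, and the data are hypothetical.

```python
import numpy as np

def nw_smooth(index_train, y_train, index_query, bandwidth):
    # Nadaraya-Watson estimate with a Gaussian kernel on the 1-D index.
    diffs = (index_query[:, None] - index_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs ** 2)
    return (weights @ y_train) / weights.sum(axis=1)

rng = np.random.default_rng(1)
beta = np.array([0.6, 0.8])  # assumed-known unit-norm index direction

# Synthetic training data: y depends on x only through the index x @ beta.
x = rng.normal(size=(300, 2))
y = np.sin(x @ beta) + rng.normal(0.0, 0.05, size=300)

# Step 1: parametric (linear) projection onto the index.
# Step 2: univariate nonparametric regression on that index.
x_new = rng.normal(size=(50, 2))
pred = nw_smooth(x @ beta, y, x_new @ beta, bandwidth=0.2)
```

Because the nonparametric step works on a one-dimensional index, the smoother needs far fewer observations than a full multi-dimensional nonparametric fit, which is the parsimony the abstract refers to.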
Cancer gene prioritization by integrative analysis of mRNA expression and DNA copy number data: a comparative review
A variety of genome-wide profiling techniques are available to probe
complementary aspects of genome structure and function. Integrative analysis of
heterogeneous data sources can reveal higher-level interactions that cannot be
detected based on individual observations. A standard integration task in
cancer studies is to identify altered genomic regions that induce changes in
the expression of the associated genes based on joint analysis of genome-wide
gene expression and copy number profiling measurements. In this review, we
provide a comparison among various modeling procedures for integrating
genome-wide profiling data of gene copy number and transcriptional alterations
and highlight common approaches to genomic data integration. A transparent
benchmarking procedure is introduced to quantitatively compare the cancer gene
prioritization performance of the alternative methods. The benchmarking
algorithms and data sets are available at http://intcomp.r-forge.r-project.org
Comment: PDF file including supplementary material. 9 pages. Preprint
A Nonparametric Bayesian Approach to Copula Estimation
We propose a novel Dirichlet-based Pólya tree (D-P tree) prior on the
copula and, based on this prior, a nonparametric Bayesian inference
procedure. Through theoretical analysis and simulations, we show
that the flexibility of the D-P tree prior ensures its consistency in copula
estimation, so that it can detect more subtle and complex copula structures than
earlier nonparametric Bayesian models, such as a Gaussian copula mixture.
Further, the continuity of the imposed D-P tree prior leads to a more favorable
smoothing effect in copula estimation over classic frequentist methods,
especially with small sets of observations. We also apply our method to
copula prediction between the S&P 500 index and IBM stock prices during
the 2007-08 financial crisis, finding that D-P tree-based methods are more
robust and flexible than classic methods under such irregular market
behavior.