Spanning Tests for Markowitz Stochastic Dominance
We derive properties of the cdf of random variables defined as saddle-type
points of real valued continuous stochastic processes. This facilitates the
derivation of the first-order asymptotic properties of tests for stochastic
spanning given some stochastic dominance relation. We define the concept of
Markowitz stochastic dominance spanning, and develop an analytical
representation of the spanning property. We construct a non-parametric test for
spanning based on subsampling, and derive its asymptotic exactness and
consistency. The spanning methodology determines whether introducing new
securities or relaxing investment constraints improves the investment
opportunity set of investors driven by Markowitz stochastic dominance. In an
application to standard data sets of historical stock market returns, we reject
market portfolio Markowitz efficiency as well as two-fund separation. Hence, we
find evidence that equity management through base assets can outperform the
market for investors with Markowitz-type preferences.
Feature Selection for Functional Data
In this paper we address the problem of feature selection when the data is
functional. We study several statistical procedures, including classification,
regression and principal components. One advantage of the blinding procedure is
that it is very flexible since the features are defined by a set of functions,
relevant to the problem being studied, proposed by the user. Our method is
consistent under a set of quite general assumptions, and produces good results
with the real data examples that we analyze. Comment: 22 pages, 4 figures
Forecasting day-ahead electricity prices in Europe: the importance of considering market integration
Motivated by the increasing integration among electricity markets, in this
paper we propose two different methods to incorporate market integration in
electricity price forecasting and to improve the predictive performance. First,
we propose a deep neural network that considers features from connected markets
to improve the predictive accuracy in a local market. To measure the importance
of these features, we propose a novel feature selection algorithm that, by
using Bayesian optimization and functional analysis of variance, evaluates the
effect of the features on the algorithm performance. In addition, using market
integration, we propose a second model that, by simultaneously predicting
prices from two markets, improves the forecasting accuracy even further. As a
case study, we consider the electricity market in Belgium and the improvements
in forecasting accuracy when using various French electricity features. We show
that the two proposed models lead to improvements that are statistically
significant. Particularly, due to market integration, the predictive accuracy
is improved from 15.7% to 12.5% sMAPE (symmetric mean absolute percentage
error). In addition, we show that the proposed feature selection algorithm is
able to perform a correct assessment, i.e. to discard the irrelevant features.
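The abstract above reports accuracy as sMAPE without stating the formula. As a minimal sketch, assuming the common definition that divides each absolute error by the mean of the absolute actual and forecast values (the paper may use a different variant), the metric can be computed as follows. The function name `smape` and the sample prices are illustrative and do not come from the paper.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent.

    Assumed definition (one of several variants in use):
        100 / N * sum(|y - y_hat| / ((|y| + |y_hat|) / 2))
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for y, y_hat in zip(actual, forecast):
        denom = (abs(y) + abs(y_hat)) / 2.0
        if denom == 0.0:
            # Both values are zero: the error is zero, so the term adds nothing.
            continue
        total += abs(y - y_hat) / denom
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical day-ahead prices, for illustration only.
    observed = [100.0, 200.0]
    predicted = [110.0, 180.0]
    print(f"sMAPE: {smape(observed, predicted):.2f}%")
```

Under this definition a lower value means a better forecast, so the reported change from 15.7% to 12.5% is a reduction in error.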
Statistical methods of SNP data analysis with applications
Various statistical methods important for genetic analysis are considered and
developed. Namely, we concentrate on the multifactor dimensionality reduction,
logic regression, random forests and stochastic gradient boosting. These
methods and their new modifications, e.g., the MDR method with "independent
rule", are used to study the risk of complex diseases such as cardiovascular
ones. The roles of certain combinations of single nucleotide polymorphisms and
external risk factors are examined. The data analysis concerning ischemic
heart disease and myocardial infarction was performed on the supercomputer SKIF
"Chebyshev" of the Lomonosov Moscow State University.
Targeted Undersmoothing
This paper proposes a post-model selection inference procedure, called
targeted undersmoothing, designed to construct uniformly valid confidence sets
for a broad class of functionals of sparse high-dimensional statistical models.
These include dense functionals, which may potentially depend on all elements
of an unknown high-dimensional parameter. The proposed confidence sets are
based on an initially selected model and two additionally selected models, an
upper model and a lower model, which enlarge the initially selected model. We
illustrate application of the procedure in two empirical examples. The first
example considers estimation of heterogeneous treatment effects using data from
the Job Training Partnership Act of 1982, and the second example looks at
estimating profitability from a mailing strategy based on estimated
heterogeneous treatment effects in a direct mail marketing campaign. We also
provide evidence on the finite sample performance of the proposed targeted
undersmoothing procedure through a series of simulation experiments.