
    Portfolio Diversification and Value at Risk Under Thick-Tailedness

    We present a unified approach to value at risk analysis under heavy-tailedness, using a new majorization theory that we develop for linear combinations of thick-tailed random variables. Among other results, we show that the stylized fact that portfolio diversification is always preferable is reversed for extremely heavy-tailed risks or returns. The stylized facts on diversification are nevertheless robust to thick-tailedness of risks or returns as long as their distributions are not extremely heavy-tailed. We further demonstrate that value at risk is a coherent measure of risk if the distributions of risks are not extremely heavy-tailed; however, its coherency is always violated under extreme thick-tailedness. Extensions of the results to the case of dependence, including convolutions of α-symmetric distributions and models with common shocks, are provided.
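    The reversal described above can be illustrated with a small Monte Carlo sketch (an illustration of the phenomenon only, not the paper's majorization argument): for i.i.d. Pareto losses, the 99% VaR of an equally weighted two-asset portfolio lies below the single-asset VaR when the tail index exceeds 1, but rises above it when the tail index falls below 1. The parameter choices below are illustrative.

```python
# Monte Carlo illustration: diversification lowers the 99% VaR for Pareto
# losses with tail index alpha > 1, but raises it when alpha < 1
# ("extremely heavy-tailed" losses).
import numpy as np

def empirical_var(losses, level=0.99):
    """Empirical value at risk: the `level` quantile of the loss distribution."""
    return np.quantile(losses, level)

rng = np.random.default_rng(0)
n_sim = 500_000

for alpha in (3.0, 0.7):                      # moderate vs. extreme tails
    x1 = rng.pareto(alpha, n_sim) + 1.0       # classical Pareto(alpha) losses
    x2 = rng.pareto(alpha, n_sim) + 1.0
    var_single = empirical_var(x1)
    var_diversified = empirical_var(0.5 * x1 + 0.5 * x2)
    print(f"alpha={alpha}: VaR(single)={var_single:.2f}, "
          f"VaR(50/50 portfolio)={var_diversified:.2f}")
```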

    Tests for High Dimensional Generalized Linear Models

    We consider testing regression coefficients in high-dimensional generalized linear models. An investigation of the test of Goeman et al. (2011) reveals that if the inverse of the link function is unbounded, high dimensionality of the covariates can adversely affect the power of the test. We propose a test formulation that avoids this adverse impact of high dimensionality. When the inverse of the link function is bounded, as in logistic or probit regression, the proposed test is as good as the test of Goeman et al. (2011). The proposed tests provide p-values for testing the significance of gene sets, as demonstrated in a case study on an acute lymphoblastic leukemia dataset. Comment: the research paper was stolen last November and illegally submitted to arXiv by a person named gong zi jiang nan. We have asked arXiv to withdraw the unfinished paper [arXiv:1311.4043], and it was removed last December. We have collected enough evidence to identify the person, and Peking University has begun to investigate the plagiarism.
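    As a rough illustration of the kind of global test involved, here is a minimal sketch of a quadratic score-type statistic for H0: all regression coefficients are zero in a high-dimensional GLM, calibrated by permutation. It is not the exact statistic, weighting, or calibration of Goeman et al. (2011) nor of the proposed modification; all names are illustrative.

```python
# Sketch of a quadratic score-type global test for H0: beta = 0 in a GLM
# with a high-dimensional design X, with a simple permutation p-value.
import numpy as np

def global_score_test(X, y, n_perm=2000, seed=0):
    """Sum-of-squared-scores statistic against the intercept-only null fit."""
    rng = np.random.default_rng(seed)
    mu0 = np.full(len(y), y.mean())     # intercept-only fitted means
    resid = y - mu0

    def stat(r):
        s = X.T @ r                     # per-covariate score contributions
        return float(s @ s)             # quadratic (sum-of-squares) statistic

    t_obs = stat(resid)
    t_perm = np.array([stat(rng.permutation(resid)) for _ in range(n_perm)])
    p_value = (1 + np.sum(t_perm >= t_obs)) / (n_perm + 1)
    return t_obs, p_value

# Pure-noise example: y is unrelated to the 500 covariates, so the p-value
# should not be systematically small.
rng = np.random.default_rng(1)
X = rng.standard_normal((80, 500))
y = rng.binomial(1, 0.4, size=80).astype(float)
print(global_score_test(X, y))
```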

    Around stability for functional inequalities

    Functional inequalities encode a great deal of information of a probabilistic (the concentration of measure phenomenon), analytic (the spectral theory of operators), and geometric (isoperimetric profiles) nature; the Poincaré inequality is a fundamental example. In this thesis, we obtain stability results under moment normalisation assumptions as well as under curvature-dimension conditions. A stability result quantifies the difference between two situations in which almost the same functional inequalities hold. The stability results obtained in this thesis are in particular based on Stein's method, a technique originating in statistics that has developed rapidly in recent years and makes it possible to establish quantitative estimates for convergence results. In addition, a part of this thesis is devoted to the study of the optimal constants in the Bobkov inequalities, which are functional inequalities of isoperimetric character.
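    For reference, the Poincaré inequality cited as the fundamental example can be written, in its standard form for a probability measure μ on R^n with Poincaré constant C_P:

```latex
% Poincaré (spectral gap) inequality for a probability measure \mu on \mathbb{R}^n,
% with Poincaré constant C_P, valid for all sufficiently smooth functions f:
\operatorname{Var}_\mu(f) \;\le\; C_P \int_{\mathbb{R}^n} |\nabla f|^2 \, \mathrm{d}\mu .
```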

    Optimization with Sparsity-Inducing Penalties

    Sparse estimation methods are aimed at using or obtaining parsimonious representations of data or models. They were first dedicated to linear variable selection, but numerous extensions have now emerged, such as structured sparsity or kernel selection. It turns out that many of the related estimation problems can be cast as convex optimization problems by regularizing the empirical risk with appropriate non-smooth norms. The goal of this paper is to present, from a general perspective, optimization tools and techniques dedicated to such sparsity-inducing penalties. We cover proximal methods, block-coordinate descent, reweighted ℓ2-penalized techniques, working-set and homotopy methods, as well as non-convex formulations and extensions, and provide an extensive set of experiments to compare various algorithms from a computational point of view.
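    As a concrete instance of the proximal methods the paper covers, here is a minimal proximal-gradient (ISTA-style) sketch for the ℓ1-regularized least-squares (lasso) problem; the step size and stopping rule are deliberately simplistic, and the names are illustrative.

```python
# Proximal gradient (ISTA) for the lasso: min_w 0.5*||X w - y||^2 + lam*||w||_1.
# The non-smooth l1 term is handled by its proximal operator (soft-thresholding).
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau*||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(X, y, lam, n_iter=500):
    n, p = X.shape
    w = np.zeros(p)
    step = 1.0 / np.linalg.norm(X, 2) ** 2    # 1/L, L = Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)              # gradient of the smooth least-squares part
        w = soft_threshold(w - step * grad, step * lam)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 200))
w_true = np.zeros(200)
w_true[:5] = 3.0
y = X @ w_true + 0.1 * rng.standard_normal(100)
w_hat = ista(X, y, lam=5.0)
print("indices of nonzero coefficients:", np.flatnonzero(np.abs(w_hat) > 1e-6)[:10])
```

    The same proximal template extends to structured or group norms by swapping in the corresponding proximal operator.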

    Testing Statistical Hypotheses for Latent Variable Models and Some Computational Issues

    In this dissertation, I address unorthodox statistical problems concerning goodness-of-fit tests in the latent-variable context and efficient statistical computation. In epidemiological and biomedical studies, observations with measurement error are quite common, especially when it is difficult to calibrate the true signals accurately. In the first problem, I develop a statistical test of the equality of two distributions when the observed contaminated data follow the classical additive measurement error model. Standard two-sample homogeneity tests, such as the Kolmogorov-Smirnov, Anderson-Darling, or von Mises tests, are not consistent when observations are subject to measurement error. To obtain a consistent test, the characteristic functions of the unobservable true random variables are first estimated from the contaminated data, and the test statistic is then defined as the integrated difference between the two estimated characteristic functions. It is shown that when the sample size is large and the null hypothesis holds, the test statistic converges to an integral of a squared Gaussian process. Because the rejection region is difficult to obtain from this limiting distribution, I propose a bootstrap approach to compute the p-value of the test statistic. The operating characteristics of the proposed test are assessed and compared with other approaches via extensive simulation studies, and the method is applied to the National Health and Nutrition Examination Survey (NHANES) dataset. Although estimation of regression parameters in the presence of exposure measurement error has been studied, this testing problem has not been considered before.
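    A minimal sketch of the characteristic-function construction just described, assuming normal measurement error with known variance; the dissertation's estimator, weight function, and bootstrap calibration are more involved, and the names below are illustrative.

```python
# Sketch of the two-sample statistic under the classical additive measurement
# error model W = X + U, assuming U ~ N(0, sigma_u^2) with known sigma_u.
import numpy as np

def deconvolved_cf(w, t, sigma_u):
    """Estimated characteristic function of the unobserved X at grid t."""
    emp_cf = np.exp(1j * np.outer(t, w)).mean(axis=1)   # empirical CF of W
    return emp_cf / np.exp(-0.5 * (sigma_u * t) ** 2)   # divide by CF of U ~ N(0, sigma_u^2)

def cf_distance_stat(w1, w2, sigma_u, t_max=2.0, n_grid=201):
    """Integrated squared difference of the two deconvolved CFs over [-t_max, t_max]."""
    t = np.linspace(-t_max, t_max, n_grid)
    diff = deconvolved_cf(w1, t, sigma_u) - deconvolved_cf(w2, t, sigma_u)
    dt = t[1] - t[0]
    return float(np.sum(np.abs(diff) ** 2) * dt)

rng = np.random.default_rng(0)
x1 = rng.gamma(2.0, size=300)                  # two samples with the same true distribution
x2 = rng.gamma(2.0, size=300)
sigma_u = 0.3                                   # known measurement error sd
w1 = x1 + sigma_u * rng.standard_normal(300)    # observed contaminated data
w2 = x2 + sigma_u * rng.standard_normal(300)
print(cf_distance_stat(w1, w2, sigma_u))        # p-value would come from a bootstrap in practice
```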
    In the next problem, I consider the stochastic frontier model (SFM), a widely used model for measuring firms' efficiency. In productivity and cost studies in econometrics, the gap between the theoretically optimal output and the actual output for a given amount of inputs is called technical inefficiency. To assess this inefficiency, the stochastic frontier model includes the gap as a latent variable in addition to the usual statistical noise. Since the gap cannot be observed, estimation and inference depend on the distributional assumption placed on the technical inefficiency term, which is usually taken to be exponential or half-normal. I develop a Bayesian test of whether this parametric assumption is correct: I construct a broad semiparametric family that approximates or contains the true distribution as the alternative and define a Bayes factor. I show consistency of the Bayes factor under certain conditions and present its finite-sample performance via Monte Carlo simulations.

    The second part of the dissertation concerns statistical computation. Frequentist standard errors quantify the uncertainty of an estimator and are used in many inference problems; here I consider standard error calculation for Bayes estimators. Outside of a few special cases, estimating the frequentist variability of an estimator requires bootstrapping to approximate its sampling distribution, and for a Bayesian model fitted by Markov chain Monte Carlo (MCMC) this is computationally expensive and often impractical, since the MCMC must be rerun on each bootstrapped dataset. To overcome this difficulty, I propose an importance sampling technique that reduces the computational burden, and I apply it to several examples, including logistic regression, the linear measurement error model, Weibull regression, and a vector autoregressive model.

    In the second computational problem, I explore binary regression with a flexible skew-probit link function, which contains the traditional probit link as a special case. The skew-probit model is useful for modelling the success probability of a binary or count response when that probability is not a symmetric function of the continuous regressors. I investigate the parameter identifiability of the skew-probit model and demonstrate that the maximum likelihood estimator (MLE) of the skewness parameter is highly biased. I then develop a penalized likelihood approach based on three penalty functions to reduce the finite-sample bias of the MLE, compare the performance of each penalized MLE through extensive simulations, and analyze heart-disease data using the proposed approaches.
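    For the first computational problem, the following sketch shows one way importance sampling can reuse a single set of posterior draws across bootstrap datasets instead of rerunning MCMC each time. Direct draws from a conjugate normal posterior stand in for MCMC output here; the dissertation's exact scheme and examples are more general, and the names are illustrative.

```python
# Sketch: frequentist standard error of a Bayes estimator via bootstrap plus
# importance sampling, so the posterior is sampled only once.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(1.0, 2.0, size=100)            # observed data, sd assumed known
sigma, mu0, tau0 = 2.0, 0.0, 10.0             # known sd and N(mu0, tau0^2) prior on the mean

def posterior_draws(data, n_draws=5000):
    """Draws from the normal posterior of the mean given `data` (conjugate update)."""
    n = len(data)
    post_var = 1.0 / (n / sigma**2 + 1.0 / tau0**2)
    post_mean = post_var * (data.sum() / sigma**2 + mu0 / tau0**2)
    return rng.normal(post_mean, np.sqrt(post_var), size=n_draws)

def loglik(theta, data):
    """Log-likelihood of each value in the vector `theta` under `data`."""
    return -0.5 * ((data[:, None] - theta) ** 2).sum(axis=0) / sigma**2

theta = posterior_draws(y)                    # reuse these draws for every bootstrap sample
ll_orig = loglik(theta, y)

boot_means = []
for _ in range(200):                          # bootstrap replicates
    yb = rng.choice(y, size=len(y), replace=True)
    logw = loglik(theta, yb) - ll_orig        # importance weights proportional to L_b / L
    w = np.exp(logw - logw.max())
    w /= w.sum()                              # self-normalization cancels posterior constants
    boot_means.append(np.sum(w * theta))      # Bayes estimate (posterior mean) under yb
print("bootstrap SE of the Bayes estimator:", np.std(boot_means, ddof=1))
```

    Self-normalizing the weights removes the unknown posterior normalizing constants, which is what makes a single run of the sampler reusable across bootstrap datasets.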