
    On Out-of-Sample Statistics for Time-Series

    This paper studies an out-of-sample statistic for time-series prediction that is analogous to the widely used in-sample R² statistic. We propose and study methods to estimate the variance of this out-of-sample statistic. We suggest that the out-of-sample statistic is more robust to the distributional and asymptotic assumptions that underlie many tests for in-sample statistics. Furthermore, we argue that in some cases it may be more important to choose a model that generalizes as well as possible than to choose the parameters that are closest to the true parameters. Comparative experiments are performed on a financial time series (daily and monthly returns of the TSE300 index). The experiments are performed for varying prediction horizons, and we study the relation between predictability (out-of-sample R²), variability of the out-of-sample R² statistic, and the prediction horizon.
    Keywords: out-of-sample statistic, time series, financial series, TSE300
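
The abstract does not spell out the statistic, so the following is only a minimal sketch under a stated assumption: the out-of-sample R² is taken to be one minus the ratio of the model's squared prediction error to that of a naive benchmark (the mean of past observations), with both computed on an expanding window so that no prediction uses future data. The function name `out_of_sample_r2`, the `predict` callback and the `min_train` parameter are hypothetical and not taken from the paper.

```python
import numpy as np

def out_of_sample_r2(y, predict, min_train=20):
    """Expanding-window out-of-sample R2 (assumed definition, not the paper's).

    y         : 1-D sequence of observations.
    predict   : callable mapping the history y[:t] to a forecast of y[t].
    min_train : number of initial observations reserved for the first fit.
    """
    y = np.asarray(y, dtype=float)
    model_sq_err = 0.0
    naive_sq_err = 0.0
    for t in range(min_train, len(y)):
        history = y[:t]  # only past values are visible at time t
        model_sq_err += (y[t] - predict(history)) ** 2
        naive_sq_err += (y[t] - history.mean()) ** 2  # benchmark: historical mean
    return 1.0 - model_sq_err / naive_sq_err
```

Under this definition a value above zero means the model beat the historical-mean benchmark on data it had not seen, and a negative value means it did worse. For example, `out_of_sample_r2(returns, lambda h: h[-5:].mean())` would score a five-observation moving-average forecaster on a hypothetical `returns` array.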

    Latent Gaussian modeling and INLA: A review with focus on space-time applications

    Bayesian hierarchical models with latent Gaussian layers have proven very flexible in capturing complex stochastic behavior and hierarchical structures in high-dimensional spatial and spatio-temporal data. Whereas simulation-based Bayesian inference through Markov Chain Monte Carlo may be hampered by slow convergence and numerical instabilities, the inferential framework of Integrated Nested Laplace Approximation (INLA) is capable of providing accurate and relatively fast analytical approximations to posterior quantities of interest. It relies heavily on Gauss-Markov dependence structures to avoid the numerical bottleneck of high-dimensional nonsparse matrix computations. With a view towards space-time applications, we review the principal theoretical concepts, model classes and inference tools within the INLA framework. Important elements for constructing space-time models are certain spatial Matérn-like Gauss-Markov random fields, obtained as approximate solutions to a stochastic partial differential equation. Efficient implementation of statistical inference tools for a large variety of models is available through the INLA package of the R software. To showcase the practical use of R-INLA and to illustrate its principal commands and syntax, a comprehensive simulation experiment is presented using simulated non-Gaussian space-time count data with a first-order autoregressive dependence structure in time.

    Internal Migration and Regional Population Dynamics in Europe: Switzerland Case Study

    This paper reports on internal migration and regional population dynamics in Switzerland. It briefly examines the main population trends of the last century and then turns to a more detailed examination and comparison of internal migration patterns and trends in three years: 1984, 1994 and 1996. First, inter-cantonal migration is investigated in the context of the life course. At the communal level, population change patterns and the underlying in-, out- and net migration are examined. An attempt is made to link migration with such variables as population density, level of unemployment and prevailing language, and with a functional classification of the urban system. The methodology is the same as in a number of other studies, making the results as comparable as possible with those of other studies of migration in European states (Rees and Kupiszewski 1999).

    Defining a robust biological prior from Pathway Analysis to drive Network Inference

    Inferring genetic networks from gene expression data is one of the most challenging tasks of the post-genomic era, partly because of the vast space of possible networks and the relatively small amount of data available. In this field, the Gaussian Graphical Model (GGM) provides a convenient framework for the discovery of biological networks. In this paper, we propose an original approach for inferring gene regulation networks that uses a robust biological prior on their structure to limit the set of candidate networks. Pathways, which represent biological knowledge about the regulatory networks, are used as informative prior knowledge to drive network inference. This approach is based on the selection of a relevant set of genes, called the "molecular signature", associated with a condition of interest (for instance, the genes involved in disease development). In this context, differential expression analysis is a well-established strategy. However, the resulting signatures are often inconsistent and show little overlap between studies. We therefore dedicate the first part of our work to improving the standard process of biomarker identification, with the aim of guaranteeing the robustness and reproducibility of the molecular signature. Our approach makes it possible to compare the networks inferred under two conditions of interest (for instance, case and control networks) and supports the biological interpretation of the results. It thus allows the identification of differential regulations that occur in these conditions. We illustrate the proposed approach by applying our method to a study of breast cancer's response to treatment.

    Regularized Maximum Likelihood Estimation and Feature Selection in Mixtures-of-Experts Models

    Mixtures of Experts (MoE) are successful models for heterogeneous data in many statistical learning problems, including regression, clustering and classification. They are generally fitted by maximum likelihood estimation via the well-known EM algorithm, and their application to high-dimensional problems therefore remains challenging. We consider the problem of fitting and feature selection in MoE models, and propose a regularized maximum likelihood estimation approach that encourages sparse solutions for heterogeneous regression data models with potentially high-dimensional predictors. Unlike state-of-the-art regularized MLE for MoE, the proposed approach does not require an approximation of the penalty function. We develop two hybrid EM algorithms: an Expectation-Majorization-Maximization (EM/MM) algorithm, and an EM algorithm with a coordinate ascent algorithm. The proposed algorithms obtain sparse solutions automatically, without thresholding, and avoid matrix inversion by allowing univariate parameter updates. An experimental study shows the good performance of the algorithms in terms of recovering the actual sparse solutions, parameter estimation, and clustering of heterogeneous regression data.

    Lower bounds for invariant statistical models with applications to principal component analysis

    This paper develops nonasymptotic information inequalities for the estimation of the eigenspaces of a covariance operator. These results generalize previous lower bounds for the spiked covariance model, and they show that recent upper bounds for models with decaying eigenvalues are sharp. The proof relies on lower bound techniques based on group invariance arguments which can also deal with a variety of other statistical models. Comment: 42 pages, to appear in Annales de l'Institut Henri Poincaré Probabilités et Statistique

    Virial expansion with Feynman diagrams

    We present a field-theoretic method for calculating the second and third virial coefficients b2 and b3 of two-species fermions interacting via a contact interaction. The method is mostly analytic. We find a closed expression for b3 in terms of the two- and three-body T-matrices. We numerically recover, at unitarity and across the whole BEC-BCS crossover, previous numerical results for the third virial coefficient b3.