On Out-of-Sample Statistics for Time-Series
This paper studies an out-of-sample statistic for time-series prediction that is analogous to the widely used in-sample R² statistic. We propose and study methods to estimate the variance of this out-of-sample statistic. We suggest that the out-of-sample statistic is more robust to the distributional and asymptotic assumptions behind many tests for in-sample statistics. Furthermore, we argue that in some cases it may be more important to choose a model that generalizes as well as possible than to choose the parameters that are closest to the true parameters. Comparative experiments are performed on a financial time series (daily and monthly returns of the TSE300 index). The experiments are performed for varying prediction horizons, and we study the relation between predictability (out-of-sample R²), variability of the out-of-sample R² statistic, and the prediction horizon.
Keywords: out-of-sample statistic, time series, TSE300, financial series
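As a rough illustration of what an out-of-sample analogue of R² can look like, the sketch below compares a model's squared forecast errors with those of a historical-mean forecast over an expanding training window. The definition, the function name `out_of_sample_r2`, and the parameters `min_train` and `horizon` are assumptions made for illustration; the paper's exact statistic and its variance estimators may differ.

```python
import numpy as np

def out_of_sample_r2(y, predict_fn, min_train=20, horizon=1):
    """Assumed definition: 1 - SSE(model) / SSE(historical mean),
    with every forecast made using only data available at forecast time."""
    y = np.asarray(y, dtype=float)
    sse_model = 0.0
    sse_naive = 0.0
    for t in range(min_train, len(y) - horizon + 1):
        train = y[:t]                      # observations up to time t-1
        target = y[t + horizon - 1]        # value `horizon` steps ahead
        sse_model += (target - predict_fn(train, horizon)) ** 2
        sse_naive += (target - train.mean()) ** 2
    return 1.0 - sse_model / sse_naive

# Two hypothetical forecasters used only to exercise the function.
def mean_forecast(train, horizon):
    return train.mean()

def linear_extrapolation(train, horizon):
    return train[-1] + horizon * (train[-1] - train[-2])
```

A value near zero means the model forecasts no better than the historical mean, and a negative value means it forecasts worse, which is possible out of sample even though the in-sample R² of a least-squares fit with an intercept cannot be negative.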
Latent Gaussian modeling and INLA: A review with focus on space-time applications
Bayesian hierarchical models with latent Gaussian layers have proven very
flexible in capturing complex stochastic behavior and hierarchical structures
in high-dimensional spatial and spatio-temporal data. Whereas simulation-based
Bayesian inference through Markov Chain Monte Carlo may be hampered by slow
convergence and numerical instabilities, the inferential framework of
Integrated Nested Laplace Approximation (INLA) can provide accurate
and relatively fast analytical approximations to posterior quantities of
interest. It heavily relies on the use of Gauss-Markov dependence structures to
avoid the numerical bottleneck of high-dimensional nonsparse matrix
computations. With a view towards space-time applications, we here review the
principal theoretical concepts, model classes and inference tools within the
INLA framework. Important elements to construct space-time models are certain
spatial Matérn-like Gauss-Markov random fields, obtained as approximate
solutions to a stochastic partial differential equation. Efficient
implementation of statistical inference tools for a large variety of models is
available through the INLA package of the R software. To showcase the practical
use of R-INLA and to illustrate its principal commands and syntax, a
comprehensive simulation experiment is presented using simulated non-Gaussian
space-time count data with a first-order autoregressive dependence structure in
time.
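To make the kind of data described above concrete, the following sketch simulates Poisson counts whose log-intensity follows a stationary first-order autoregressive process in time at each site. It is not the paper's experiment and does not use R-INLA: the function name, all parameter values, and the simplification that sites are spatially independent (no Matérn field) are assumptions for illustration.

```python
import numpy as np

def simulate_ar1_counts(n_sites=25, n_times=40, rho=0.8, sigma=0.5,
                        intercept=1.0, seed=0):
    """Simulate counts ~ Poisson(exp(intercept + x[t, s])), where each site's
    latent series x[:, s] is a stationary Gaussian AR(1) process.
    Assumption: sites are independent of one another (no spatial field)."""
    rng = np.random.default_rng(seed)
    latent = np.empty((n_times, n_sites))
    # Start from the stationary distribution N(0, sigma^2).
    latent[0] = rng.normal(0.0, sigma, size=n_sites)
    # Innovation scale chosen so the marginal variance stays sigma^2.
    innov_sd = sigma * np.sqrt(1.0 - rho ** 2)
    for t in range(1, n_times):
        latent[t] = rho * latent[t - 1] + rng.normal(0.0, innov_sd, size=n_sites)
    counts = rng.poisson(np.exp(intercept + latent))
    return latent, counts
```

The returned `counts` array has one row per time point and one column per site; a space-time model would replace the independent site-level noise with a spatially correlated innovation.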
Natural resource exploitation and the role of new technology: a case-history of the UK herring industry
Internal Migration and Regional Population Dynamics in Europe: Switzerland Case Study
This paper reports on internal migration and regional population dynamics in Switzerland. It examines briefly the main population trends in the last century and then turns to more detailed examination of internal migration patterns and trends in three years, 1984, 1994 and 1996 and compares them. First, inter-cantonal migration is investigated in the context of the life course. On the communal level population change patterns and underlying in-, out- and net migration are examined. An attempt is made to link migration with such variables as population density, level of unemployment, prevailing language and with a functional classification of the urban system. The methodology used is the same as in a number of other studies, making the results as comparable as possible with the results of other studies of migration in European states (Rees and Kupiszewski 1999)
Defining a robust biological prior from Pathway Analysis to drive Network Inference
Inferring genetic networks from gene expression data is one of the most
challenging tasks in the post-genomic era, partly due to the vast space of
possible networks and the relatively small amount of data available. In this
field, the Gaussian Graphical Model (GGM) provides a convenient framework for the
discovery of biological networks. In this paper, we propose an original
approach for inferring gene regulation networks using a robust biological prior
on their structure in order to limit the set of candidate networks.
Pathways, which represent biological knowledge of the regulatory networks,
will be used as informative prior knowledge to drive network inference. This
approach is based on the selection of a relevant set of genes, called the
"molecular signature", associated with a condition of interest (for instance,
the genes involved in disease development). In this context, differential
expression analysis is a well-established strategy. However, outcome signatures
are often not consistent and show little overlap between studies. Thus, we will
dedicate the first part of our work to the improvement of the standard process
of biomarker identification to guarantee the robustness and reproducibility of
the molecular signature.
Our approach makes it possible to compare the networks inferred under two
conditions of interest (for instance, case and control networks) and aids the
biological interpretation of the results. It thus allows the identification of
differential regulations that occur in these conditions. We illustrate the
proposed approach by applying our method to a study of breast cancer's response to treatment
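In a Gaussian Graphical Model, an edge between two genes corresponds to a nonzero partial correlation, which can be read off the inverse covariance (precision) matrix. The sketch below shows this standard relationship on a sample large enough for the covariance to be invertible. It is not the paper's prior-driven method, and the function name is an assumption; with more genes than samples, as is typical for expression data, some form of regularization would be needed instead of a plain inverse.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix from the inverse sample covariance.
    data: array of shape (n_samples, n_variables), with n_samples > n_variables."""
    cov = np.cov(data, rowvar=False)
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    # rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor
```

For a chain in which a first variable drives a second and the second drives a third, the first and third are marginally correlated but their partial correlation is close to zero, so the graph contains no direct edge between them.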
Regularized Maximum Likelihood Estimation and Feature Selection in Mixtures-of-Experts Models
Mixtures of Experts (MoE) are successful models for heterogeneous
data in many statistical learning problems, including regression, clustering and
classification. They are generally fitted by maximum likelihood estimation via the
well-known EM algorithm, and their application to high-dimensional problems
therefore remains challenging. We consider the problem of fitting and feature
selection in MoE models, and propose a regularized maximum likelihood
estimation approach that encourages sparse solutions for heterogeneous
regression data models with potentially high-dimensional predictors. Unlike
state-of-the-art regularized MLE for MoE, the proposed modeling approaches do not
require an approximation of the penalty function. We develop two hybrid EM
algorithms: an Expectation-Majorization-Maximization (EM/MM) algorithm, and an
EM algorithm with a coordinate ascent algorithm. The proposed algorithms make it
possible to obtain sparse solutions automatically, without thresholding, and avoid matrix
inversion by allowing univariate parameter updates. An experimental study shows
the good performance of the algorithms in terms of recovering the actual sparse
solutions, parameter estimation, and clustering of heterogeneous regression
data
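The idea of univariate updates that produce exact zeros without matrix inversion can be illustrated on a much simpler problem than a mixture of experts: plain lasso linear regression solved by coordinate descent with soft-thresholding. This is a sketch of the general technique only, not the paper's EM/MM or EM/coordinate-ascent algorithms; the function names, the objective scaling, and the fixed number of sweeps are assumptions.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1 one coordinate at a time."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    residual = y - X @ beta
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(p):
            # Add back coordinate j's contribution to get the partial residual.
            residual += X[:, j] * beta[j]
            rho = X[:, j] @ residual
            # Closed-form univariate update: no matrix inversion is involved.
            beta[j] = soft_threshold(rho, n * lam) / col_sq[j]
            residual -= X[:, j] * beta[j]
    return beta
```

Coefficients whose partial correlation with the residual stays below the penalty level are set to exactly zero by the update itself, so no separate thresholding step is required afterwards.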
Lower bounds for invariant statistical models with applications to principal component analysis
This paper develops nonasymptotic information inequalities for the estimation
of the eigenspaces of a covariance operator. These results generalize previous
lower bounds for the spiked covariance model, and they show that recent upper
bounds for models with decaying eigenvalues are sharp. The proof relies on
lower bound techniques based on group invariance arguments which can also deal
with a variety of other statistical models. Comment: 42 pages, to appear in
Annales de l'Institut Henri Poincaré, Probabilités et Statistique
Virial expansion with Feynman diagrams
We present a field theoretic method for the calculation of the second and
third virial coefficients b2 and b3 of 2-species fermions interacting via a
contact interaction. The method is mostly analytic. We find a closed expression
for b3 in terms of the 2 and 3-body T-matrices. We recover numerically, at
unitarity, and also in the whole BEC-BCS crossover, previous numerical results
for the third virial coefficient b3
Graphs and notes on the economic situation in the Community = Graphiques et notes rapides sur la conjoncture dans la Communaute No. 4, 1971
