Generalized Network Psychometrics: Combining Network and Latent Variable Models
We introduce the network model as a formal psychometric model,
conceptualizing the covariance between psychometric indicators as resulting
from pairwise interactions between observable variables in a network structure.
This contrasts with standard psychometric models, in which the covariance
between test items arises from the influence of one or more common latent
variables. Here, we present two generalizations of the network model that
encompass latent variable structures, establishing network modeling as part of
the more general framework of Structural Equation Modeling (SEM). In the first
generalization, we model the covariance structure of latent variables as a
network. We term this framework Latent Network Modeling (LNM) and show that,
with LNM, a unique structure of conditional independence relationships between
latent variables can be obtained in an explorative manner. In the second
generalization, the residual variance-covariance structure of indicators is
modeled as a network. We term this generalization Residual Network Modeling
(RNM) and show that, within this framework, identifiable models can be obtained
in which local independence is structurally violated. These generalizations
allow for a general modeling framework that can be used to fit, and compare,
SEM models, network models, and the RNM and LNM generalizations. This
methodology has been implemented in the free-to-use software package lvnet,
which contains confirmatory model testing as well as two exploratory search
algorithms: stepwise search algorithms for low-dimensional datasets and
penalized maximum likelihood estimation for larger datasets. We show in
simulation studies that these search algorithms perform adequately in
identifying the structure of the relevant residual or latent networks. We
further demonstrate the utility of these generalizations in an empirical
example on a personality inventory dataset. Comment: Published in Psychometrik
Non Parametric Models with Instrumental Variables
This paper gives a survey of econometric models characterized by a relation between observable and unobservable random elements, where the unobservable terms are assumed to be independent of another set of observable variables called instrumental variables. This kind of specification is useful for addressing, for example, endogeneity or selection bias. These models are treated nonparametrically, and in all the examples we consider, the functional parameter of interest is defined as the solution of a linear or nonlinear integral equation. The estimation procedure then requires solving a (generally ill-posed) inverse problem. We illustrate the main questions (construction of the equation, identification, numerical solution, asymptotic properties, selection of the regularization parameter) with the different models we present.
Interpretable statistics for complex modelling: quantile and topological learning
As the complexity of our data has increased exponentially in the last decades, so has our
need for interpretable features. This thesis revolves around two paradigms to approach
this quest for insights.
In the first part we focus on parametric models, where the problem of interpretability
can be seen as a “parametrization selection”. We introduce a quantile-centric
parametrization and we show the advantages of our proposal in the context of regression,
where it allows us to bridge the gap between classical generalized linear (mixed)
models and increasingly popular quantile methods.
The second part of the thesis, concerned with topological learning, tackles the
problem from a non-parametric perspective. As topology can be thought of as a way
of characterizing data in terms of their connectivity structure, it allows us to represent
complex and possibly high-dimensional data through a few features, such as the number of
connected components, loops and voids. We illustrate how the emerging branch of
statistics devoted to recovering topological structures in the data, Topological Data
Analysis, can be exploited both for exploratory and inferential purposes with a special
emphasis on kernels that preserve the topological information in the data.
Finally, we show with an application how these two approaches can borrow strength
from one another in the identification and description of brain activity through fMRI
data from the ABIDE project.
Inference for High-Dimensional Sparse Econometric Models
This article is about estimation and inference methods for high dimensional
sparse (HDS) regression models in econometrics. High dimensional sparse models
arise in situations where many regressors (or series terms) are available and
the regression function is well-approximated by a parsimonious, yet unknown set
of regressors. The latter condition makes it possible to estimate the entire
regression function effectively by searching for approximately the right set of
regressors. We discuss methods for identifying this set of regressors and
estimating their coefficients based on $\ell_1$-penalization and describe key
theoretical results. In order to capture realistic practical situations, we
expressly allow for imperfect selection of regressors and study the impact of
this imperfect selection on estimation and inference results. We focus the main
part of the article on the use of HDS models and methods in the instrumental
variables model and the partially linear model. We present a set of novel
inference results for these models and illustrate their use with applications
to returns to schooling and growth regression.
Feature selection guided by structural information
In generalized linear regression problems with an abundant number of
features, lasso-type regularization which imposes an $\ell_1$-constraint on the
regression coefficients has become a widely established technique. Deficiencies
of the lasso in certain scenarios, notably strongly correlated design, were
unmasked when Zou and Hastie [J. Roy. Statist. Soc. Ser. B 67 (2005) 301--320]
introduced the elastic net. In this paper we propose to extend the elastic net
by admitting general nonnegative quadratic constraints as a second form of
regularization. The generalized ridge-type constraint will typically make use
of the known association structure of features, for example, by using temporal
or spatial closeness. We study properties of the resulting "structured elastic
net" regression estimation procedure, including basic asymptotics and the issue
of model selection consistency. In this vein, we provide an analog to the
so-called "irrepresentable condition" which holds for the lasso. Moreover, we
outline algorithmic solutions for the structured elastic net within the
generalized linear model family. The rationale and the performance of our
approach are illustrated by means of simulated and real world data, with a focus
on signal regression. Comment: Published at http://dx.doi.org/10.1214/09-AOAS302 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
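The idea of adding a quadratic, structure-aware penalty to the lasso can be sketched in a few lines. The abstract does not give the objective, so the form used here is an assumption: squared-error loss scaled by 1/(2n), an $\ell_1$ term weighted by `lam1`, and a quadratic term (lam2/2)·βᵀΛβ with a symmetric nonnegative-definite matrix Λ. The names `structured_enet` and `chain_penalty` are hypothetical, and plain coordinate descent is one possible solver, not necessarily the algorithm outlined in the paper.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def chain_penalty(p):
    """Hypothetical structure matrix D'D for first differences.

    beta' (D'D) beta = sum_j (beta[j+1] - beta[j])**2, which encodes
    closeness of neighbouring (for example, temporally adjacent) features.
    """
    lam = [[0.0] * p for _ in range(p)]
    for j in range(p - 1):
        lam[j][j] += 1.0
        lam[j + 1][j + 1] += 1.0
        lam[j][j + 1] -= 1.0
        lam[j + 1][j] -= 1.0
    return lam


def structured_enet(X, y, lam1, lam2, Lam, sweeps=200):
    """Coordinate descent for the assumed objective

        (1/2n) * ||y - X b||^2 + lam1 * ||b||_1 + (lam2/2) * b' Lam b

    X is a list of rows, y a list of responses, Lam a symmetric matrix.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual (beta[j] removed).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            # Pull from the other coefficients through the structure matrix.
            pull = sum(Lam[j][k] * beta[k] for k in range(p) if k != j)
            denom = col_sq[j] + lam2 * Lam[j][j]
            new = soft_threshold(rho - lam2 * pull, lam1) / denom if denom > 0 else 0.0
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta
```

With `lam2 = 0` this reduces to the lasso, and with `Lam` set to the identity it gives an elastic-net-style penalty, so the structure matrix is the only new ingredient in the sketch.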
Robustness and Regularization of Support Vector Machines
We consider regularized support vector machines (SVMs) and show that they are
precisely equivalent to a new robust optimization formulation. We show that
this equivalence of robust optimization and regularization has implications for
both algorithms and analysis. In terms of algorithms, the equivalence suggests
more general SVM-like algorithms for classification that explicitly build in
protection to noise, and at the same time control overfitting. On the analysis
front, the equivalence of robustness and regularization provides a robust
optimization interpretation for the success of regularized SVMs. We use
this new robustness interpretation of SVMs to give a new proof of consistency
of (kernelized) SVMs, thus establishing robustness as the reason regularized
SVMs generalize well.
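A small numerical check can make the link between robustness and regularization concrete for a fixed linear classifier. The abstract does not state the uncertainty set, so this sketch assumes one: each training point x_i may be shifted by δ_i, with the Euclidean norms of the shifts summing to at most a budget `c`. Under that assumption the worst-case hinge loss is bounded above by the hinge loss plus c·‖w‖, and the bound is attained by spending the whole budget on one point whose hinge term is active. All function names are hypothetical.

```python
import math


def hinge_sum(w, b, xs, ys):
    """Sum of hinge losses max(0, 1 - y * (w . x + b)) over the sample."""
    total = 0.0
    for x, y in zip(xs, ys):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        total += max(0.0, 1.0 - margin)
    return total


def regularized_loss(w, b, xs, ys, c):
    """Hinge loss plus the norm penalty c * ||w||_2."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return hinge_sum(w, b, xs, ys) + c * norm_w


def single_point_attack_loss(w, b, xs, ys, c):
    """Largest hinge loss when the whole budget c is spent on one sample.

    The chosen sample is moved by c * y * w / ||w||, the direction that
    lowers its margin fastest. This is a lower bound on the worst case over
    the assumed uncertainty set (sum of ||delta_i||_2 <= c).
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    if norm_w == 0.0:
        raise ValueError("w must be nonzero")
    worst = hinge_sum(w, b, xs, ys)
    for i, (x, y) in enumerate(zip(xs, ys)):
        shifted = [xi - c * y * wi / norm_w for xi, wi in zip(x, w)]
        perturbed = xs[:i] + [shifted] + xs[i + 1:]
        worst = max(worst, hinge_sum(w, b, perturbed, ys))
    return worst


# Toy data: the second point already violates the margin.
w, b, c = [1.0, 0.0], 0.0, 0.3
xs = [[2.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
ys = [1, 1, -1]
robust = single_point_attack_loss(w, b, xs, ys, c)
penalized = regularized_loss(w, b, xs, ys, c)
```

On this toy sample the two quantities coincide, which is the fixed-classifier version of the equivalence; the statement over the minimizing classifier is the paper's result and is not reproduced here.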